Abstract

As shown by Medard ("The effect upon channel capacity in wireless communications of perfect and imperfect
knowledge of the channel," IEEE Trans. Inf. Theory, May 2000), the capacity of fading channels with imperfect
channel-state information (CSI) can be lower-bounded by assuming a Gaussian channel input $X$ with power $P$ and
by upper-bounding the conditional entropy $h(X|Y,\hat{H})$, conditioned on the channel output $Y$ and the CSI $\hat{H}$, by the
entropy of a Gaussian random variable with variance equal to the linear minimum mean-square error in estimating $X$
from $(Y,\hat{H})$. We demonstrate that, using a rate-splitting approach, this lower bound can be sharpened: by expressing
the Gaussian input $X$ as the sum of two independent Gaussian variables $X_1$ and $X_2$ and by applying Medard's lower
bound first to bound the mutual information between $X_1$ and $Y$ while treating $X_2$ as noise, and by applying the
lower bound then to bound the mutual information between $X_2$ and $Y$ while assuming $X_1$ to be known, we obtain
a lower bound on the capacity that is strictly larger than Medard's lower bound. We then generalize this approach
to an arbitrary number $L$ of layers, where $X$ is expressed as the sum of $L$ independent Gaussian random variables
of respective variances $P_\ell$, $\ell = 1,\ldots,L$, summing up to $P$. Among all such rate-splitting bounds, we determine the
supremum over power allocations $P_\ell$ and total number of layers $L$. This supremum is achieved for $L \to \infty$ and
gives rise to an analytically expressible lower bound on the Gaussian-input mutual information. For Gaussian fading,
this novel bound is shown to be asymptotically tight at high signal-to-noise ratio (SNR), provided that the variance
of the channel estimation error $H - \hat{H}$ tends to zero as the SNR tends to infinity.
I. Introduction and Channel Model

We consider a single-antenna memoryless fading channel with imperfect channel-state information (CSI), whose
time-$k$ channel output $Y[k]$ corresponding to a time-$k$ channel input $X[k] = x \in \mathbb{C}$ (where $\mathbb{C}$ denotes the set of
complex numbers) is given by

$$Y[k] = (\hat{H}[k] + \tilde{H}[k])\,x + Z[k], \quad k \in \mathbb{Z} \tag{1}$$

(with $\mathbb{Z}$ denoting the set of integers). Here, the noise $\{Z[k]\}_{k\in\mathbb{Z}}$ is a sequence of independent and identically
distributed (i.i.d.), zero-mean, circularly-symmetric, complex Gaussian random variables with variance $E[|Z[k]|^2] = N_0$.
The fading pair $\{(\hat{H}[k], \tilde{H}[k])\}_{k\in\mathbb{Z}}$ is an arbitrary sequence of i.i.d. complex-valued random variables whose
means and variances satisfy the following conditions:

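• $\hat{H}[k]$ has mean $\mu$ and variance $\hat{V}$;

• conditioned on $\hat{H}[k] = \hat{h}$, the random variable $\tilde{H}[k]$ has zero mean and variance $\tilde{V}(\hat{h})$, i.e.,

$$E\big[\tilde{H}[k] \mid \hat{H}[k] = \hat{h}\big] = 0 \tag{2a}$$
$$E\big[|\tilde{H}[k]|^2 \mid \hat{H}[k] = \hat{h}\big] = \tilde{V}(\hat{h}). \tag{2b}$$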

We assume that the joint sequence $\{(\hat{H}[k], \tilde{H}[k])\}_{k\in\mathbb{Z}}$, the noise sequence $\{Z[k]\}_{k\in\mathbb{Z}}$ and the input sequence
$\{X[k]\}_{k\in\mathbb{Z}}$ are all three mutually independent. We further assume that the receiver is cognizant of the realization
of $\{\hat{H}[k]\}_{k\in\mathbb{Z}}$, but the transmitter is only cognizant of its distribution. We finally assume that both the transmitter
and receiver are cognizant of the distributions of $\{\tilde{H}[k]\}_{k\in\mathbb{Z}}$ and $\{Z[k]\}_{k\in\mathbb{Z}}$ but not of their realizations. The $\hat{H}[k]$
can be viewed as an estimate of the channel fading coefficient

$$H[k] \triangleq \hat{H}[k] + \tilde{H}[k]. \tag{3}$$

Accordingly, $\tilde{H}[k]$ can be viewed as the channel estimation error. From this perspective, the condition (2a) is, for
example, satisfied when $\hat{H}[k]$ is the minimum mean-square error (MMSE) estimate of $H[k]$ from some receiver
side information. When $\tilde{H}[k] = 0$ almost surely, we shall say that the receiver has perfect CSI.

The capacity of the above channel (1) under the average-power constraint $P$ on the channel inputs is given by

$$C(P) = \sup I(X;Y|\hat{H}) \tag{4}$$

where the supremum is over all distributions of $X$ satisfying $E[|X|^2] \le P$. Here and throughout the paper we omit
the time indices $k$ wherever they are immaterial. Since (4) is difficult to evaluate, even if $\hat{H}$ and $\tilde{H}$ are Gaussian, it
is common to assess $C(P)$ using upper and lower bounds. A widely-used lower bound on $C(P)$ is due to Medard:

$$C(P) \ge E\left[\log\left(1 + \frac{|\hat{H}|^2 P}{\tilde{V}(\hat{H})P + N_0}\right)\right] \triangleq R_M(P). \tag{5}$$
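The bound (5) is straightforward to evaluate numerically. The following is a minimal sketch under assumptions that are not part of the definition above: $\hat{H}$ is taken to be zero-mean Gaussian (so that $|\hat{H}|^2$ is exponentially distributed), the error variance is a constant `v_tilde`, and the expectation is approximated by a midpoint rule over exponential quantiles. The function names and default values are illustrative only.

```python
import math


def exp_quantile_nodes(n):
    """Midpoint quantile nodes of a unit-mean exponential variable.

    Averaging g over these nodes approximates E[g(W)] for W ~ Exp(1).
    """
    return [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]


def medard_bound(P, N0=1.0, v_hat=0.5, v_tilde=0.5, n=4000):
    """Approximate R_M(P) of (5) in nats.

    Assumption: Rayleigh fading with mu = 0, i.e. |H_hat|^2 = v_hat * U with
    U ~ Exp(1), and an error variance v_tilde that does not depend on H_hat.
    """
    noise = v_tilde * P + N0  # residual fading plus additive noise
    total = 0.0
    for u in exp_quantile_nodes(n):
        total += math.log1p(v_hat * u * P / noise)
    return total / n


if __name__ == "__main__":
    for snr_db in (0, 10, 20, 30, 40):
        P = 10.0 ** (snr_db / 10.0)
        print(snr_db, "dB:", round(medard_bound(P) / math.log(2), 4), "bits")
```

With a fixed `v_tilde > 0`, the printed values level off as the SNR grows, because the term $\tilde{V}(\hat{H})P$ in the denominator of (5) scales with the signal power.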
This lower bound follows from (4) by choosing $X$ to be zero-mean, variance-$P$ Gaussian and by upper-bounding
the differential entropy of $X$ conditioned on $Y$ and $\hat{H}$ as

$$\begin{aligned}
h(X|Y,\hat{H}) &= h(X - \alpha Y \mid Y,\hat{H}) \\
&\le h(X - \alpha Y \mid \hat{H}) \\
&\le E\left[\log\left(\pi e\, E\big[|X - \alpha Y|^2 \mid \hat{H}\big]\right)\right]
\end{aligned} \tag{6}$$

for any $\alpha \in \mathbb{C}$. Here the first inequality follows because conditioning cannot increase entropy, and the subsequent
inequality follows because the Gaussian distribution maximizes differential entropy for a given second moment [3,
Th. 9.6.5]. By expressing the mutual information $I(X;Y|\hat{H})$ as

$$I(X;Y|\hat{H}) = h(X) - h(X|Y,\hat{H}) \tag{7}$$

and by choosing $\alpha$ so that $\alpha Y$ is the linear MMSE estimate of $X$, the lower bound (5) follows.
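The last step can be checked with a few lines of arithmetic. The sketch below evaluates, for a single realization $\hat{H} = \hat{h}$, the linear MMSE coefficient and error and confirms that $h(X)$ minus the Gaussian entropy bound in (6) equals the integrand of (5). The function name and the test values are hypothetical; the second-moment expressions use only (1), (2a), (2b) and the independence assumptions.

```python
import math


def lmmse_rate(h_hat, v_tilde, P, N0):
    """Evaluate the bound from (6)-(7) for one realization H_hat = h_hat (nats)."""
    # E[X Y^* | h_hat] = conj(h_hat) * P, because E[H_tilde | h_hat] = 0.
    cross = h_hat.conjugate() * P
    # E[|Y|^2 | h_hat] = (|h_hat|^2 + v_tilde) * P + N0.
    power_y = (abs(h_hat) ** 2 + v_tilde) * P + N0
    alpha = cross / power_y                 # linear MMSE coefficient
    mmse = P - abs(cross) ** 2 / power_y    # E[|X - alpha Y|^2 | h_hat]
    rate = math.log(P / mmse)               # log(pi e P) - log(pi e mmse)
    return alpha, mmse, rate


if __name__ == "__main__":
    h = complex(0.6, -0.8)
    alpha, mmse, rate = lmmse_rate(h, 0.5, 10.0, 1.0)
    direct = math.log1p(abs(h) ** 2 * 10.0 / (0.5 * 10.0 + 1.0))
    print(alpha, mmse, rate, direct)
```

The two printed rates agree because $P/\mathrm{MMSE} = 1 + |\hat{h}|^2 P / (\tilde{V}(\hat{h})P + N_0)$.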

When the receiver has perfect CSI so that $E[\tilde{V}(\hat{H})] = 0$, the lower bound $R_M(P)$ is equal to the channel capacity

$$C_{\mathrm{coh}}(P) = E\left[\log\left(1 + \frac{|H|^2 P}{N_0}\right)\right]. \tag{8}$$

Consequently, for perfect CSI the lower bound (5) is tight.

In contrast, when the receiver has imperfect CSI and $\tilde{V}(\hat{H})$ and $\hat{H}$ do not depend on $P$, the lower bound (5) is
loose. In fact, in this case the lower bound $R_M(P)$ is bounded in $P$, whereas the capacity $C(P)$ is known to be
unbounded. For instance, if $\tilde{H}$ is of finite differential entropy, then the capacity has a double-logarithmic growth
in $P$ [4].¹

This boundedness of $R_M(P)$ is not due to the inequalities in (6) being loose, but is a consequence of choosing
a Gaussian channel input. Indeed, if $\tilde{H}$ is of finite differential entropy, then a Gaussian input $X_G$ achieves [5,
Proposition 6.3.1], [4, Lemma 4.5]

$$\varlimsup_{P\to\infty} I(X_G;Y|\hat{H}) \le \gamma + \log\left(\pi e\, E\big[|\hat{H} + \tilde{H}|^2\big]\right) - h(\tilde{H}) \tag{9}$$

where $\gamma \approx 0.577$ denotes Euler's constant and where $\varlimsup$ denotes the limit superior. Nevertheless, even if we
restrict ourselves to Gaussian inputs, the lower bound

$$I(X_G;Y|\hat{H}) \ge R_M(P) \tag{10}$$

is not tight. As we shall see, by using a rate-splitting (or successive-decoding) approach, this lower bound (10) can be
sharpened: we show in Section II that, by expressing the Gaussian input $X_G$ as the sum of two independent Gaussian
random variables $X_1$ and $X_2$, and by first applying the bounding technique sketched in (6)-(7) to $I(X_1;Y|\hat{H})$ (thus
treating $\hat{H}X_2$ as noise) and then using the same bounding technique to lower-bound $I(X_2;Y|\hat{H},X_1)$, we obtain
a lower bound on the Gaussian-input mutual information (and thus also on the capacity) that is strictly larger than
the conventional bound $R_M(P)$.

¹This result can be generalized to show that if $E[\log|\hat{H} + \tilde{H}|^2] > -\infty$ holds, then the capacity grows at least
double-logarithmically with $P$.

In Section III we expand this approach by expressing $X$ as the sum of $L \ge 2$ independent Gaussian random
variables $X_\ell$, $\ell = 1,\ldots,L$, and by applying the bounding technique from (6)-(7) first to $I(X_1;Y|\hat{H})$, then to
$I(X_2;Y|\hat{H},X_1)$, and so on. We show that the so-obtained lower bound is strictly increasing in $L$ (provided that
we optimize the sum of bounds over the powers $P_\ell = E[|X_\ell|^2]$, $\ell = 1,\ldots,L$), and we determine its limit as $L$
tends to infinity. The so-obtained lower bound permits an analytic expression. In the remainder of this paper, we
shall refer to $\ell$ as a layer and to $L$ as the number of layers.

In Section IV we show that when, conditioned on $\hat{H}$, the estimation error $\tilde{H}$ is Gaussian, and when its average
variance (averaged over $\hat{H}$) tends to zero as the SNR tends to infinity, the new lower bound tends to the Gaussian-input
mutual information $I(X_G;Y|\hat{H})$ as the SNR tends to infinity. For non-Gaussian fading, we show that, at
high SNR, the difference between $I(X_G;Y|\hat{H})$ and our lower bound is upper-bounded by the difference of the
logarithms of the variance of $\tilde{H}$ and of its entropy power.

The rest of the paper is organized as follows. In Section V we discuss the connection of our results with similar
results obtained in the mismatched-decoding literature. In Sections VI and VII we provide the proofs of the main
results. And in Section VIII we conclude the paper with a summary and discussion.



II. Rate-Splitting With Two Layers

For future reference, we state Medard's lower bound (5) in a slightly more general form in the following
proposition.

Proposition 1 (Medard [2]). Let $S$ be a zero-mean, circularly-symmetric, complex Gaussian random variable of
variance $P$. Let $A$ and $B$ be complex-valued random variables of finite second moments, and let $C$ be an arbitrary
random variable. Assume that $S$ is independent of $(A, C)$, and that, conditioned on $(A, C)$, the variables $S$ and
$B$ are uncorrelated. Then

$$I(S; AS + B \mid A, C) \ge E\left[\log\left(1 + \frac{|A|^2 P}{V_B(A,C)}\right)\right] \tag{11}$$

where $V_B(a,c)$ denotes the conditional variance of $B$ conditioned on $(A,C) = (a,c)$.

Proof: See Appendix A. ■
Using Proposition 1 we show that, for imperfect CSI and $E[|\hat{H}|^2] > 0$, rate splitting with two layers strictly
improves the lower bound (10). Indeed, let $X_1$ and $X_2$ be independent, zero-mean, circularly-symmetric, complex
Gaussian random variables with respective variances $P_1$ and $P_2$ (satisfying $P_1 + P_2 = P$) such that $X = X_1 + X_2$.
By the chain rule for mutual information, we obtain

$$\begin{aligned}
I(X;Y|\hat{H}) &= I(X_1,X_2;Y|\hat{H}) \\
&= I(X_1;Y|\hat{H}) + I(X_2;Y|\hat{H},X_1).
\end{aligned} \tag{12}$$
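By replacing the variables $A$, $B$, $C$ and $S$ in Proposition 1 with

$$A = \hat{H}, \quad B = \hat{H}X_2 + \tilde{H}X + Z, \quad C = 0, \quad S = X_1$$

it follows that the first term on the right-hand side (RHS) of (12) is lower-bounded as

$$I(X_1;Y|\hat{H}) \ge E\left[\log\left(1 + \frac{|\hat{H}|^2 P_1}{(|\hat{H}|^2 + \tilde{V}(\hat{H}))P_2 + \tilde{V}(\hat{H})P_1 + N_0}\right)\right] \triangleq R_1(P_1,P_2). \tag{13}$$

Similarly, by replacing $A$, $B$, $C$, $S$ in Proposition 1 with

$$A = \hat{H}, \quad B = \hat{H}X_1 + \tilde{H}X + Z, \quad C = X_1, \quad S = X_2$$

we obtain for the second term on the RHS of (12)

$$I(X_2;Y|\hat{H},X_1) \ge E\left[\log\left(1 + \frac{|\hat{H}|^2 P_2}{\tilde{V}(\hat{H})(|X_1|^2 + P_2) + N_0}\right)\right] \triangleq R_2(P_1,P_2). \tag{14}$$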

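Noting that, for every $a > 0$, the function $x \mapsto \log(1 + a/x)$ is strictly convex in $x > 0$, it follows from Jensen's
inequality that the RHS of (14) is lower-bounded as

$$E\left[\log\left(1 + \frac{|\hat{H}|^2 P_2}{\tilde{V}(\hat{H})(|X_1|^2 + P_2) + N_0}\right)\right] \ge E\left[\log\left(1 + \frac{|\hat{H}|^2 P_2}{\tilde{V}(\hat{H})(P_1 + P_2) + N_0}\right)\right] \tag{15}$$

with the inequality being strict except in the trivial cases where $P_1 = 0$, $P_2 = 0$, or if, with probability one, at
least one of $\hat{H}$ and $\tilde{V}(\hat{H})$ is zero.² The sum of $R_1(P_1,P_2)$ and the RHS of (15) equals $R_M(P)$, since the two
logarithms telescope. Thus, combining (12)-(15), we obtain

$$R_1(P_1,P_2) + R_2(P_1,P_2) \ge E\left[\log\left(1 + \frac{|\hat{H}|^2 P}{\tilde{V}(\hat{H})P + N_0}\right)\right] \tag{16}$$

demonstrating that, when the receiver has imperfect CSI, rate splitting with two layers strictly improves the capacity
and mutual information lower bound (5) (except in trivial cases).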

Figure 1 compares the two-layer bound $R_1(P_1,P_2) + R_2(P_1,P_2)$ with $R_M(P)$ (dashed line) as a function of
$P_1/P$, for $\hat{H}$ and $\tilde{H}$ being circularly-symmetric Gaussian with parameters $\mu = 0$, $\hat{V} = 1/2$, $\tilde{V}(\hat{h}) = 1/2$ for $\hat{h} \in \mathbb{C}$,
$P = 10$, and $N_0 = 1$. The figure confirms our above observation that, when the receiver has imperfect CSI and
$P_1 > 0$ and $P_2 > 0$, rate splitting with two layers outperforms $R_M(P)$. In this example, the optimal power
allocation is approximately at $P_1 \approx 0.78P$ and $P_2 \approx 0.22P$.
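The comparison can be reproduced approximately with a short script. The following sketch evaluates (13) and (14) for Gaussian $\hat{H}$ and constant error variance, using that $|X_1|^2 = P_1 W$ with $W$ unit-mean exponential and independent of $\hat{H}$. The parameter defaults (`v_hat`, `v_tilde`, `N0`) and the quadrature rule are assumptions of this sketch, not values taken from the authors' simulation code.

```python
import math


def exp_quantile_nodes(n):
    """Midpoint quantile nodes of a unit-mean exponential variable."""
    return [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]


def two_layer_bound(P1, P2, N0=1.0, v_hat=0.5, v_tilde=0.5, n=200):
    """Approximate R_1(P1, P2) + R_2(P1, P2) of (13)-(14) in nats."""
    nodes = exp_quantile_nodes(n)
    r1 = 0.0
    r2 = 0.0
    for u in nodes:
        g = v_hat * u  # |H_hat|^2 under the Rayleigh assumption
        # First layer: X_2 is treated as noise.
        r1 += math.log1p(g * P1 / ((g + v_tilde) * P2 + v_tilde * P1 + N0))
        # Second layer: X_1 is known, |X_1|^2 = P1 * W with W ~ Exp(1).
        inner = 0.0
        for w in nodes:
            inner += math.log1p(g * P2 / (v_tilde * (P1 * w + P2) + N0))
        r2 += inner / n
    return (r1 + r2) / n


if __name__ == "__main__":
    P = 10.0
    for i in range(11):
        frac = i / 10.0
        rate = two_layer_bound(frac * P, (1.0 - frac) * P)
        print(frac, round(rate / math.log(2), 4), "bits")
```

At the end points `frac = 0` and `frac = 1` the sum reduces to $R_M(P)$, and for interior fractions it lies above it, in line with (16).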

One might wonder whether extending our approach to more than two layers can further improve the lower bound.
As we shall see in the following section, it does. In fact, for every power $P$ we show that, once the power is
optimally allocated across layers, the rate-splitting lower bound is strictly increasing in the number of layers.

²We may write this as $\Pr\{\hat{H} \cdot \tilde{V}(\hat{H}) = 0\} = 1$. For example, this occurs when the receiver has perfect CSI, in which case
$\tilde{V}(\hat{H}) = 0$ almost surely.



January 28, 2013 DRAFT



Fig. 1. Comparison of the 2-layer lower bound $R_1(P_1, P - P_1) + R_2(P_1, P - P_1)$ (continuous line) with Medard's lower bound $R_M(P)$
(dashed line) as a function of the power fraction $P_1/P$ assigned to the first layer.



III. Rate-Splitting With L Layers

Let $X_1,\ldots,X_L$ be independent, zero-mean, circularly-symmetric, complex Gaussian random variables with
respective variances $P_1,\ldots,P_L$ satisfying

$$\sum_{\ell=1}^{L} P_\ell = P \tag{17}$$

such that

$$X = \sum_{\ell=1}^{L} X_\ell. \tag{18}$$

Let the cumulative power $Q_k$ be given by

$$Q_k \triangleq \sum_{\ell=1}^{k} P_\ell. \tag{19}$$

We denote the collection of cumulative powers as

$$\mathbf{Q} \triangleq (Q_1,\ldots,Q_L)$$

and refer to it as an $L$-layering.

It follows from the chain rule for mutual information that

$$I(X^L;Y|\hat{H}) = \sum_{\ell=1}^{L} I(X_\ell;Y \mid X^{\ell-1},\hat{H}) \tag{20}$$

where we use the shorthand $A^M$ to denote the sequence $A_1,\ldots,A_M$, and $A^0$ denotes the empty sequence. Applying
Proposition 1 by replacing $A$, $B$, $C$, $S$ with the respective

$$A = \hat{H}, \quad B = \hat{H}\sum_{\ell'\neq\ell} X_{\ell'} + \tilde{H}X + Z, \quad C = X^{\ell-1}, \quad S = X_\ell,$$
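we can lower-bound the $\ell$-th summand on the RHS of (20) as

$$I(X_\ell;Y \mid X^{\ell-1},\hat{H}) \ge E\left[\log\left(1 + \Gamma_{\ell,\mathbf{Q}}(X^{\ell-1},\hat{H})\right)\right] \triangleq R_\ell[\mathbf{Q}] \tag{21}$$

where

$$\Gamma_{\ell,\mathbf{Q}}(X^{\ell-1},\hat{H}) \triangleq \frac{|\hat{H}|^2 P_\ell}{\tilde{V}(\hat{H})\,\big|\sum_{i<\ell} X_i\big|^2 + \tilde{V}(\hat{H})P_\ell + (|\hat{H}|^2 + \tilde{V}(\hat{H}))\sum_{i>\ell} P_i + N_0} \tag{22}$$

and where the last line in (21) should be viewed as the definition of $R_\ell[\mathbf{Q}]$. Defining

$$R[\mathbf{Q}] \triangleq R_1[\mathbf{Q}] + \ldots + R_L[\mathbf{Q}] \tag{23}$$

we obtain from (20) and (21) the lower bound

$$I(X;Y|\hat{H}) = I(X^L;Y|\hat{H}) \ge R[\mathbf{Q}]. \tag{24}$$

Note that $Q_{\ell-1} = Q_\ell$ implies $P_\ell = 0$, which in turn implies $R_\ell[\mathbf{Q}] = 0$. Without loss of optimality, we can
therefore restrict ourselves to $L$-layerings satisfying

$$0 < Q_1 < \ldots < Q_L = P. \tag{25}$$

We shall denote the set of such $L$-layerings by $\mathcal{Q}(P,L)$.

Let $R^*(P,L)$ denote the lower bound $R[\mathbf{Q}]$ optimized over all $\mathbf{Q} \in \mathcal{Q}(P,L)$, i.e.,

$$R^*(P,L) \triangleq \sup_{\mathbf{Q}\in\mathcal{Q}(P,L)} R[\mathbf{Q}]. \tag{26}$$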
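To make the definition of $R[\mathbf{Q}]$ concrete, the following sketch evaluates (21)-(23) for a given list of cumulative powers. It assumes Gaussian $\hat{H}$ with $\mu = 0$ and a constant error variance, and it uses that the partial sum $X_1 + \ldots + X_{\ell-1}$ is Gaussian with variance $Q_{\ell-1}$, so its squared magnitude is $Q_{\ell-1}W$ with $W$ unit-mean exponential. Function names, defaults and the quadrature rule are illustrative choices.

```python
import math


def exp_quantile_nodes(n):
    """Midpoint quantile nodes of a unit-mean exponential variable."""
    return [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]


def layering_bound(Q, N0=1.0, v_hat=0.5, v_tilde=0.5, n=150):
    """Approximate R[Q] of (23) in nats for cumulative powers Q = [Q_1, ..., Q_L]."""
    nodes = exp_quantile_nodes(n)
    P = Q[-1]
    total = 0.0
    q_prev = 0.0
    for q in Q:
        p_layer = q - q_prev   # P_l, power of the layer being decoded
        remaining = P - q      # total power of the layers still treated as noise
        for u in nodes:
            g = v_hat * u      # |H_hat|^2
            for w in nodes:    # |X_1 + ... + X_{l-1}|^2 = Q_{l-1} * W
                denom = (v_tilde * q_prev * w + v_tilde * p_layer
                         + (g + v_tilde) * remaining + N0)
                total += math.log1p(g * p_layer / denom)
        q_prev = q
    return total / (n * n)


if __name__ == "__main__":
    for Q in ([10.0], [5.0, 10.0], [2.5, 5.0, 7.5, 10.0]):
        print(len(Q), "layers:", round(layering_bound(Q) / math.log(2), 4), "bits")
```

A single-element list reproduces Medard's bound (5); refining the layering by inserting additional cumulative powers increases the value under these assumptions.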

In the following, we show that R*(P,L) is monotonically increasing in L. To this end, we need the following 
lemma. 

Lemma 2. Let $L' > L$, and let the $L$-layering $\mathbf{Q} \in \mathcal{Q}(P,L)$ and the $L'$-layering $\mathbf{Q}' \in \mathcal{Q}(P,L')$ satisfy

$$\{Q_1,\ldots,Q_L\} \subset \{Q'_1,\ldots,Q'_{L'}\}. \tag{27}$$

Then

$$R[\mathbf{Q}] \le R[\mathbf{Q}'] \tag{28}$$

with equality if, and only if, $\Pr\{\hat{H} \cdot \tilde{V}(\hat{H}) = 0\} = 1$.

Proof: See Appendix B. ■

Theorem 3. Assume that $\Pr\{\hat{H} \cdot \tilde{V}(\hat{H}) = 0\} < 1$. Then, $R^*(P,L)$ is monotonically nondecreasing in $L$.

Proof: For every $L$-layering $\mathbf{Q} \in \mathcal{Q}(P,L)$, we can construct an $(L+1)$-layering $\mathbf{Q}' \in \mathcal{Q}(P,L+1)$ satisfying
$\mathbf{Q} \subset \mathbf{Q}'$ by adding $(Q_1 + Q_2)/2$ to $\mathbf{Q}$. Together with Lemma 2 this implies that for every $\mathbf{Q} \in \mathcal{Q}(P,L)$ there
exists a $\mathbf{Q}' \in \mathcal{Q}(P,L+1)$ such that $R[\mathbf{Q}] < R[\mathbf{Q}']$, from which the theorem follows upon maximizing both sides
of the inequality over all layerings $\mathbf{Q} \in \mathcal{Q}(P,L)$ and $\mathbf{Q}' \in \mathcal{Q}(P,L+1)$, respectively. ■
It follows from Theorem 3 that the best lower bound, optimized over all layerings of fixed sum-power $P$,

$$R^*(P) \triangleq \sup_{L\in\mathbb{N}}\ \sup_{\mathbf{Q}\in\mathcal{Q}(P,L)} R[\mathbf{Q}] = \sup_{L\in\mathbb{N}} R^*(P,L) \tag{29}$$

(where $\mathbb{N}$ denotes the positive integers) is approached by letting the number of layers $L$ tend to infinity. An explicit
expression for $R^*(P)$ is provided by the following theorem.

Theorem 4. For a given input power $P$, the supremum of all rate-splitting lower bounds $R[\mathbf{Q}]$ over $\mathbf{Q} \in \mathcal{Q}(P,L)$
and $L \in \mathbb{N}$, is given by

$$R^*(P) = \lim_{L\to\infty} R^*(P,L) = E\left[\frac{|\hat{H}|^2}{|\hat{H}|^2 + \tilde{V}(\hat{H}) + \frac{N_0}{P}}\ \Theta\!\left(\frac{\tilde{V}(\hat{H})(W-1) - |\hat{H}|^2}{|\hat{H}|^2 + \tilde{V}(\hat{H}) + \frac{N_0}{P}}\right)\right] \tag{30}$$

where

$$\Theta(x) \triangleq \begin{cases} \frac{1}{x}\log(1+x), & \text{if } -1 < x < 0 \text{ or } x > 0 \\ 1, & \text{if } x = 0 \end{cases} \tag{31}$$

and where $W$ is independent of $\hat{H}$ and exponentially distributed with mean 1.

Proof: The proof of Theorem 4 is given in Section VI. ■
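As an illustration of how (30) and (31) can be evaluated, the sketch below approximates the double expectation over $\hat{H}$ and $W$ by a midpoint rule over exponential quantiles. It assumes Gaussian $\hat{H}$ with $\mu = 0$ and a constant error variance; the function names, the default parameter values and the small-argument guard in `theta` are choices made for this sketch.

```python
import math


def exp_quantile_nodes(n):
    """Midpoint quantile nodes of a unit-mean exponential variable."""
    return [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]


def theta(x):
    """Theta(x) of (31): log(1 + x) / x for x > -1, with Theta(0) = 1."""
    if abs(x) < 1e-9:
        return 1.0 - x / 2.0  # first-order expansion around x = 0
    return math.log1p(x) / x


def rate_splitting_supremum(P, N0=1.0, v_hat=0.5, v_tilde=0.5, n=300):
    """Approximate R*(P) of (30) in nats under the Rayleigh assumption."""
    nodes = exp_quantile_nodes(n)
    total = 0.0
    for u in nodes:
        g = v_hat * u                    # |H_hat|^2
        scale = g + v_tilde + N0 / P     # common denominator in (30)
        for w in nodes:                  # W ~ Exp(1), independent of H_hat
            total += (g / scale) * theta((v_tilde * (w - 1.0) - g) / scale)
    return total / (n * n)


if __name__ == "__main__":
    for snr_db in (0, 10, 20, 30):
        P = 10.0 ** (snr_db / 10.0)
        print(snr_db, "dB:", round(rate_splitting_supremum(P) / math.log(2), 4), "bits")
```

Setting `v_tilde = 0` makes the argument of `theta` independent of $W$ and the expression collapses to the coherent capacity $E[\log(1 + |\hat{H}|^2 P/N_0)]$, which is a convenient sanity check.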



Remark 1. The proof of Theorem 4 hinges on the observation that the supremum $R^*(P)$ is approached by a
uniform layering

$$\mathbf{U}(P,L) \triangleq \left(\frac{P}{L},\ 2\frac{P}{L},\ \ldots,\ (L-1)\frac{P}{L},\ P\right) \tag{32}$$

when the number of layers $L$ is taken to infinity. While this layering was chosen for mathematical convenience, any
other layering would also do, provided that some regularity conditions are met. For example, one can show that
for any Lipschitz-continuous monotonic bijection $F\colon [0,P] \to [0,P]$, we have

$$\lim_{L\to\infty} R[F(\mathbf{U}(P,L))] = \lim_{L\to\infty} R[\mathbf{U}(P,L)] = R^*(P) \tag{33}$$

where $F(\mathbf{U}(P,L)) = \{F(P/L), F(2P/L), \ldots, F(P)\}$.



To assess the tightness of the derived lower bounds, we consider two upper bounds on the mutual information
for Gaussian inputs. The first upper bound is the capacity when the receiver has perfect CSI [cf. (8)] and follows
by noting that improving the CSI at the receiver does not reduce mutual information:

$$I(X_G;Y|\hat{H}) \le E\left[\log\left(1 + \frac{|H|^2 P}{N_0}\right)\right] = C_{\mathrm{coh}}(P). \tag{34}$$

The second upper bound is given by

$$I(X_G;Y|\hat{H}) \le R_M(P) + E\left[\log\left(\frac{\tilde{V}(\hat{H})P + N_0}{\tilde{\Phi}(\hat{H})PW + N_0}\right)\right] \triangleq I_{\mathrm{upper}}(P) \tag{35}$$

where $W$ is an exponentially distributed random variable of mean 1, and where $\tilde{\Phi}(\hat{h})$ denotes the conditional entropy
power of $\tilde{H}$, conditioned on $\hat{H} = \hat{h}$:³

$$\tilde{\Phi}(\hat{h}) \triangleq \begin{cases} \frac{1}{\pi e}\, e^{h(\tilde{H} \mid \hat{H} = \hat{h})}, & \text{if } h(\tilde{H} \mid \hat{H} = \hat{h}) > -\infty \\ 0, & \text{otherwise.} \end{cases} \tag{36}$$

This upper bound follows from expanding the mutual information as $h(Y|\hat{H}) - h(Y|X_G,\hat{H})$, upper-bounding
$h(Y|\hat{H})$ by the entropy of a Gaussian variable of same variance, and lower-bounding $h(Y|X_G,\hat{H})$ using the
entropy-power inequality [6, Theorem 6].

The upper bound (35) was previously used, e.g., in [7, Equation (42)], [8, Lemma 2] for Gaussian fading, in which
case the entropy-power inequality is tight and the entropy power equals the conditional variance, i.e., $\tilde{\Phi}(\hat{h}) = \tilde{V}(\hat{h})$
for $\hat{h} \in \mathbb{C}$.
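For Gaussian fading both upper bounds reduce to one-dimensional expectations, which the following sketch evaluates. It assumes that $\hat{H}$ and $\tilde{H}$ are independent zero-mean Gaussian with variances `v_hat` and `v_tilde`, so that $|H|^2$ is exponential with mean `v_hat + v_tilde` and the entropy power equals `v_tilde`. Names and defaults are illustrative.

```python
import math


def exp_quantile_nodes(n):
    """Midpoint quantile nodes of a unit-mean exponential variable."""
    return [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]


def gaussian_fading_upper_bounds(P, N0=1.0, v_hat=0.5, v_tilde=0.5, n=4000):
    """Return (C_coh, I_upper) of (34)-(35) in nats for Gaussian H_hat, H_tilde."""
    c_coh = 0.0
    r_m = 0.0
    penalty = 0.0
    for w in exp_quantile_nodes(n):
        # |H|^2 = (v_hat + v_tilde) * W for independent zero-mean Gaussian parts.
        c_coh += math.log1p((v_hat + v_tilde) * w * P / N0)
        # Medard's bound (5) with |H_hat|^2 = v_hat * W.
        r_m += math.log1p(v_hat * w * P / (v_tilde * P + N0))
        # Second term of (35); entropy power equals v_tilde for Gaussian errors.
        penalty += math.log((v_tilde * P + N0) / (v_tilde * P * w + N0))
    return c_coh / n, (r_m + penalty) / n


if __name__ == "__main__":
    for snr_db in (0, 10, 20, 30):
        P = 10.0 ** (snr_db / 10.0)
        c_coh, i_upper = gaussian_fading_upper_bounds(P)
        print(snr_db, "dB:", round(c_coh, 4), round(i_upper, 4), "nats")
```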



Fig. 2. Comparison between several bounds on the capacity and the Gaussian-input mutual information. (Horizontal axis: $P/N_0$ [dB]; curves: $C_{\mathrm{coh}}(P)$, $I_{\mathrm{upper}}(P)$, $R^*(P)$, $R^*(P,2)$, $R_M(P)$.)



In Figure 2, several bounds on the mutual information $I(X_G;Y|\hat{H})$ for Gaussian inputs are plotted against the
SNR on a range from $-10$ dB to 30 dB. From top to bottom, we have the coherent capacity (34); the upper
bound (35); the supremum $R^*(P)$ of all rate-splitting bounds (Theorem 4); the two-layer rate-splitting bound with
optimized power allocation $R^*(P,2)$; and Medard's lower bound $R_M(P)$. The grey-shaded area indicates the region
in which the curve of the exact Gaussian-input mutual information $I(X_G;Y|\hat{H})$ is located. For this simulation, we
have chosen $\hat{H}$ and $\tilde{H}$ to be independent and complex circularly-symmetric Gaussian with parameters $\mu = 0$,
$\hat{V} = 1/2$, and $\tilde{V}(\hat{h}) = 1/2$, $\hat{h} \in \mathbb{C}$. Observe that the proposed rate-splitting approach achieves the most significant rate
gains at high SNR. In this simulation, the increase $R^*(P) - R_M(P)$ is approximately 0.28 bits per channel use for
large $P$.

³We define $h(\tilde{H} \mid \hat{H} = \hat{h}) = -\infty$ if the conditional distribution of $\tilde{H}$, conditioned on $\hat{H} = \hat{h}$, is not absolutely continuous with respect to
the Lebesgue measure.

IV. Asymptotically Optimal CSI

The numerical example considered in the previous section (see Figure 2) assumes that $\tilde{V}(\hat{H})$ and $\hat{H}$ do not depend
on the SNR $P/N_0$. However, in practical communication systems, the channel estimation error (as measured by
the mean error variance $E[\tilde{V}(\hat{H})]$) typically decreases as the SNR increases. In this section, we investigate the
high-SNR behavior of the rates achievable with and without rate splitting when $E[\tilde{V}(\hat{H})]$ vanishes as the SNR tends
to infinity.

A. Asymptotic Tightness

Let us consider a family of joint distributions of $(\hat{H},\tilde{H})$ parametrized by $\rho = P/N_0$. To make this dependence
on $\rho$ explicit, we shall write the two channel components as $\hat{H}_\rho$ and $\tilde{H}_\rho$, and the respective variances as $\hat{V}_\rho$ and
$\tilde{V}_\rho(\hat{H}_\rho)$. Similarly, we shall write the entropy power, defined in (36), as $\tilde{\Phi}_\rho(\hat{H}_\rho)$. We further adapt the notation to
express Medard's lower bound, the rate-splitting lower bounds (26) and (29), and the upper bounds (34) and (35)
as functions of $\rho$, namely, $R_M(\rho)$, $R^*(\rho,L)$, $R^*(\rho)$, $C_{\mathrm{coh}}(\rho)$, and $I_{\mathrm{upper}}(\rho)$.

We assume that $H = \hat{H}_\rho + \tilde{H}_\rho$ does not depend on $\rho$ and is normalized:

$$E[|\hat{H}_\rho|^2] + E[\tilde{V}_\rho(\hat{H}_\rho)] = 1. \tag{37}$$

We further assume that the variance of the estimation error $\tilde{H}_\rho$ is not larger than the variance of $H$, i.e., $\tilde{V}_\rho(\hat{h}_\rho) \le 1$
for every $\hat{h}_\rho \in \mathbb{C}$.

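Theorem 5. Let $\hat{H}_\rho$, $\tilde{V}_\rho(\hat{H}_\rho)$ and $\tilde{\Phi}_\rho(\hat{H}_\rho)$ satisfy

$$\lim_{\rho\to\infty} E[\tilde{V}_\rho(\hat{H}_\rho)] = 0 \tag{38a}$$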

I5i E\\Hp\^] < oo (38b) 

p->oo 

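$$\varlimsup_{\rho\to\infty} \left\{ \sup_{\xi\in\mathbb{C}} \frac{\tilde{V}_\rho(\xi)}{\tilde{\Phi}_\rho(\xi)} \right\} \le M \tag{38c}$$

for some finite constant $M$, where we define $0/0 \triangleq 1$ and $a/0 \triangleq \infty$ for every $a > 0$. Then, we have

$$\varlimsup_{\rho\to\infty} \left\{ I(X_G;Y|\hat{H}_\rho) - R^*(\rho) \right\} \le \log(M). \tag{39}$$

Proof: See Section VII. ■

If $\tilde{H}_\rho$ is Gaussian, then we have $\tilde{V}_\rho(\hat{h}_\rho) = \tilde{\Phi}_\rho(\hat{h}_\rho)$ for $\hat{h}_\rho \in \mathbb{C}$ and the choice $M = 1$ satisfies (38c). Thus, for
Gaussian fading the lower bound $R^*(\rho)$ is asymptotically tight: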
Corollary 6. Conditioned on every $\hat{H}_\rho = \hat{h}_\rho$, let $\tilde{H}_\rho$ be Gaussian, and let $\lim_{\rho\to\infty} E[\tilde{V}_\rho(\hat{H}_\rho)] = 0$. Then, we have

$$\lim_{\rho\to\infty} \left\{ I(X_G;Y|\hat{H}_\rho) - R^*(\rho) \right\} = 0. \tag{40}$$

In [5], it was argued that the difference between $R_M(\rho)$ and $C_{\mathrm{coh}}(\rho)$ (34) vanishes as $\rho$ tends to infinity only if
$\tilde{V}_\rho(\hat{H}_\rho)$ decays faster than the reciprocal of $\rho$. In this case, Medard's lower bound is asymptotically tight. In fact,
if the fading is Gaussian, then it can be shown that

$$\lim_{\rho\to\infty} \left\{ I(X_G;Y|\hat{H}_\rho) - R_M(\rho) \right\} = 0 \iff \lim_{\rho\to\infty} \rho\, E[\tilde{V}_\rho(\hat{H}_\rho)] = 0. \tag{41}$$

This is in stark contrast to $R^*(\rho)$: Corollary 6 demonstrates that, for Gaussian fading, the lower bound $R^*(\rho)$ is
asymptotically tight as long as $E[\tilde{V}_\rho(\hat{H}_\rho)]$ vanishes as $\rho$ tends to infinity, irrespective of the rate of decay.



B. Prediction- and Interpolation-Based Channel Estimation

We evaluate the lower bounds $R_M(\rho)$, $R^*(\rho,2)$, and $R^*(\rho)$ together with the upper bound $I_{\mathrm{upper}}(\rho)$ for two
specific channel estimation errors satisfying (38a). We assume that $\hat{H}_\rho$ and $\tilde{H}_\rho$ are zero-mean, circularly-symmetric
Gaussian random variables that are independent of each other, the former with variance $\hat{V}_\rho$ and the latter with
variance $\tilde{V}_\rho$. We consider variances $\tilde{V}_\rho$ of the forms

$$\tilde{V}_\rho = \left(\frac{1}{2B} + \frac{1}{\rho}\right)^{2B} \left(\frac{1}{\rho}\right)^{1-2B} - \frac{1}{\rho} \tag{42a}$$

and

$$\tilde{V}_\rho = \frac{2BT}{\rho + 2BT} \tag{42b}$$

for some $0 < B < \frac{1}{2}$, where $T = \lfloor 1/(2B) \rfloor$ is the largest integer not greater than $1/(2B)$.

As we shall argue next, (42a) corresponds to prediction-based channel estimation, whereas (42b) corresponds to
interpolation-based channel estimation:

Suppose for a moment that the fading process $\{H[k]\}_{k\in\mathbb{Z}}$ is not i.i.d. (as assumed in Section I) but is a zero-mean,
unit-variance, stationary, circularly-symmetric, complex Gaussian process with power spectral density

$$f_H(\lambda) = \begin{cases} \frac{1}{2B}, & |\lambda| < B \\ 0, & B \le |\lambda| \le \frac{1}{2} \end{cases} \tag{43}$$

for some $0 < B < \frac{1}{2}$. The fading's autocovariance function is determined by $f_H(\cdot)$ through the expression

$$E\big[(H[k+m] - \mu)(H[k] - \mu)^*\big] = \int_{-1/2}^{1/2} e^{\mathrm{i}2\pi m\lambda} f_H(\lambda)\,\mathrm{d}\lambda \tag{44}$$

where $(\cdot)^*$ denotes complex conjugation and $\mathrm{i} = \sqrt{-1}$.

We obtain (42a) if we let $\hat{H}[k]$ be the minimum mean-square error (MMSE) predictor in predicting $H[k]$ from
a noisy observation of its past

$$H[k-1]\sqrt{\rho} + Z[k-1],\ H[k-2]\sqrt{\rho} + Z[k-2],\ \ldots \tag{45}$$

Indeed, in this case $\hat{H}[k]$ and $\tilde{H}[k] \triangleq H[k] - \hat{H}[k]$ are zero-mean, circularly-symmetric Gaussian random variables
that are independent of each other, the latter with mean zero and variance [9], [10, Equation (11)]

$$\tilde{V}_\rho = \exp\left( \int_{-1/2}^{1/2} \log\left( f_H(\lambda) + \frac{1}{\rho} \right) \mathrm{d}\lambda \right) - \frac{1}{\rho}. \tag{46}$$

For the power spectral density (43) this gives (42a). Note that, even though the lower bounds $R_M(\rho)$, $R^*(\rho,L)$, and
$R^*(\rho)$ were derived for i.i.d. fading $\{\hat{H}_\rho[k], \tilde{H}_\rho[k]\}_{k\in\mathbb{Z}}$, by evaluating them for $\tilde{H}_\rho[k]$ having variance (46), they
can be used to derive lower bounds on the capacity of noncoherent fading channels with stationary fading having
power spectral density $f_H(\cdot)$; see [10].

The variance (42b) corresponds to a channel-estimation scheme where the transmitter emits every $T$ time instants
(say at $k = nT$, $n \in \mathbb{Z}$) a pilot symbol $\sqrt{P}$ and where the receiver estimates the fading coefficients at the remaining
time instants $k$ (i.e., where $k$ is not an integer multiple of $T$) from the noisy observations

$$H[nT]\sqrt{P} + Z[nT], \quad n \in \mathbb{Z} \tag{47}$$

using an MMSE interpolator; see, e.g., [11]-[13]. When the power spectral density $f_H(\cdot)$ is bandlimited to $B$ and
when $T \le 1/(2B)$, it can be shown that the variance of the estimation error is given by [14]

$$\tilde{V}_\rho = 1 - \int_{-B}^{B} \frac{\rho\,[f_H(\lambda)]^2}{\rho f_H(\lambda) + T}\,\mathrm{d}\lambda. \tag{48}$$

For the power spectral density (43) this gives (42b). Again, even though the lower bounds $R_M(\rho)$, $R^*(\rho,L)$, and
$R^*(\rho)$ were derived for i.i.d. fading $\{\hat{H}_\rho[k], \tilde{H}_\rho[k]\}_{k\in\mathbb{Z}}$, by evaluating them for $\tilde{H}_\rho[k]$ having variance (42b), they
can be directly used to derive lower bounds on the capacity of noncoherent fading channels with stationary fading
having power spectral density $f_H(\cdot)$, provided that we account for the rate loss due to the transmission of pilots.
In fact, it was shown that, when $1/(2B)$ is an integer, the above interpolation-based channel estimation scheme
together with Medard's lower bound $R_M(\rho)$ achieves the capacity pre-log [12], [15].⁴
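The step from the integral expressions (46) and (48) to the closed forms (42a) and (42b) can be checked numerically for the rectangular spectrum (43). The sketch below uses a plain midpoint rule; the function names are hypothetical, and the grid size is chosen so that, for $B = 1/4$, the band edges fall between grid points.

```python
import math


def rect_psd(lam, B):
    """Rectangular power spectral density (43)."""
    return 1.0 / (2.0 * B) if abs(lam) < B else 0.0


def prediction_error_variance(rho, B, n=4000):
    """Evaluate (46) by a midpoint rule over [-1/2, 1/2]."""
    acc = 0.0
    for i in range(n):
        lam = -0.5 + (i + 0.5) / n
        acc += math.log(rect_psd(lam, B) + 1.0 / rho)
    return math.exp(acc / n) - 1.0 / rho


def interpolation_error_variance(rho, B, n=4000):
    """Evaluate (48) by a midpoint rule over [-B, B], with T = floor(1 / (2B))."""
    T = math.floor(1.0 / (2.0 * B))
    step = 2.0 * B / n
    acc = 0.0
    for i in range(n):
        f = rect_psd(-B + (i + 0.5) * step, B)
        acc += rho * f * f / (rho * f + T)
    return 1.0 - acc * step


def closed_form_42a(rho, B):
    """Closed form (42a) for prediction-based estimation."""
    return ((1.0 / (2.0 * B) + 1.0 / rho) ** (2.0 * B)
            * (1.0 / rho) ** (1.0 - 2.0 * B) - 1.0 / rho)


def closed_form_42b(rho, B):
    """Closed form (42b) for interpolation-based estimation."""
    T = math.floor(1.0 / (2.0 * B))
    return 2.0 * B * T / (rho + 2.0 * B * T)


if __name__ == "__main__":
    for rho in (1.0, 10.0, 100.0, 1000.0):
        print(rho,
              prediction_error_variance(rho, 0.25), closed_form_42a(rho, 0.25),
              interpolation_error_variance(rho, 0.25), closed_form_42b(rho, 0.25))
```

Both variances vanish as $\rho$ grows, so (38a) holds in either case, but at different rates: the prediction error (42a) decays like $\rho^{-(1-2B)}$, whereas the interpolation error (42b) decays like $1/\rho$.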



C. Numerical Examples

Figure 3(a) shows the lower bounds $R_M(\rho)$, $R^*(\rho,2)$, and $R^*(\rho)$ together with the upper bounds $I_{\mathrm{upper}}(\rho)$ and
$C_{\mathrm{coh}}(\rho)$ as a function of $\rho$ for $\tilde{H}_\rho$ having variance (42a), with $B = 1/4$. Figure 3(b) shows the same bounds as
a function of the energy per information bit. The shaded area constitutes the area where the mutual information
corresponding to Gaussian inputs may lie. Observe that, in contrast to the curves in Figure 2, all curves are
unbounded in the SNR, which is a consequence of the fact that $\tilde{V}_\rho$ vanishes as $\rho$ tends to infinity. Further observe
that the shaded area decreases as $\rho$ grows. This is consistent with Corollary 6, which states that for Gaussian fading
the lower bound $R^*(\rho)$ is asymptotically tight.

Figure 4(a) shows the lower bounds $R_M(\rho)$, $R^*(\rho,2)$, and $R^*(\rho)$ together with the upper bounds $I_{\mathrm{upper}}(\rho)$ and
$C_{\mathrm{coh}}(\rho)$ as a function of $\rho$ for $\tilde{H}_\rho$ having variance (42b), with $BT = 1/2$. Again, observe that all curves are
unbounded in the SNR and that the lower bound $R^*(\rho)$ is asymptotically tight as $\rho$ tends to infinity. In fact, $R^*(\rho)$
is close to $I_{\mathrm{upper}}(\rho)$ for a large range of SNR. Further observe that, at high SNR, all upper and lower bounds have
the same logarithmic slope. This fact was used in [12], [13] to derive tight lower bounds on the capacity pre-log
of noncoherent fading channels.

Fig. 5 shows the lower bounds $R_M(\rho)$, $R^*(\rho,2)$, and $R^*(\rho)$ together with the upper bounds $I_{\mathrm{upper}}(\rho)$ and $C_{\mathrm{coh}}(\rho)$
normalized by $R_M(\rho)$ as a function of $\rho$ for $\tilde{H}_\rho$ having variance (42b). Observe that, as $\rho$ tends to zero, the ratios
of the lower bounds $R^*(\rho,2)$ and $R^*(\rho)$ to $R_M(\rho)$ tend to one. Thus, at low SNR, rate splitting only provides
moderate rate gains.

⁴The capacity pre-log is defined as the limiting ratio of the capacity to $\log(\rho)$ as $\rho$ tends to infinity. In multiple-input multiple-output (MIMO)
systems, it is sometimes also referred to as the number of degrees of freedom or the multiplexing gain.

Fig. 3. Prediction-based channel estimation. (a) Mutual information bounds as a function of the SNR. (b) Mutual information bounds as a function of the minimum energy per information bit.

Fig. 4. Interpolation-based channel estimation. (a) Mutual information bounds as a function of the SNR $\rho$ [dB]. (b) Mutual information bounds as a function of the minimum energy per information bit $E_b/N_0$ [dB].
Fig. 5. Mutual information bounds divided by $R_M(\rho)$ as a function of the SNR.



V. Relationship to Mismatched Decoding

We have demonstrated that Medard's lower bound $R_M(P)$ on the capacity of fading channels with imperfect CSI
can be sharpened by using a rate-splitting approach: by expressing the Gaussian input $X$ as the sum of $L$ Gaussian
random variables $X_1,\ldots,X_L$, by applying the chain rule for mutual information to express $I(X;Y|\hat{H})$ as

$$I(X;Y|\hat{H}) = \sum_{\ell=1}^{L} I(X_\ell;Y \mid X^{\ell-1},\hat{H}) \tag{49}$$

and by lower-bounding each mutual information on the RHS of (49) using Medard's bounding technique, we obtain
a lower bound that is strictly larger than $R_M(P)$.

This result is reminiscent of a result in the mismatched decoding literature. Indeed, it has been shown that
Medard's lower bound $R_M(P)$ is the generalized mutual information (GMI)⁵ [15]-[17] of the above channel (1)
when the codebook is drawn according to an i.i.d. Gaussian distribution and when the decoding rule is the scaled
nearest neighbor decoding rule under which the decoder chooses the message $m$ that minimizes [5, Corollary 3.0.1]

$$D(m) = \frac{1}{n} \sum_{k=1}^{n} \left| y[k] - \hat{h}[k]\, x^{(m)}[k] \right|^2. \tag{50}$$

Here, $(x^{(m)}[1],\ldots,x^{(m)}[n])$ denotes the codeword associated with the message $m \in \{1,\ldots,\lfloor e^{nR} \rfloor\}$, and $R$ and $n$
denote the rate and the blocklength of the code, respectively. It has been further shown that, for a given decoding
rule, treating the single-user channel as a multiple-access channel (MAC) can sometimes yield an achievable rate
that is larger than the GMI or other achievable rates corresponding to codebooks under which the codewords are
drawn independently [18].

⁵For a given channel and decoding rule, the GMI is the rate below which the average probability of error (averaged over the ensemble of
i.i.d. codebooks) decays to zero as the blocklength tends to infinity, and above which this average tends to one.
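The decoding rule (50) is simple to state in code. The following is a minimal sketch with hypothetical function names; the demonstration at the bottom draws a small i.i.d. Gaussian codebook and simulates the channel (1) with independent Gaussian $\hat{H}$ and $\tilde{H}$, using parameter values chosen only for illustration.

```python
import math
import random


def decoding_metric(y, h_hat, codeword):
    """D(m) of (50): average squared distance between y[k] and h_hat[k] * x[k]."""
    n = len(y)
    return sum(abs(y[k] - h_hat[k] * codeword[k]) ** 2 for k in range(n)) / n


def nearest_neighbor_decode(y, h_hat, codebook):
    """Return the index m of the codeword that minimizes D(m)."""
    return min(range(len(codebook)),
               key=lambda m: decoding_metric(y, h_hat, codebook[m]))


def complex_gauss(rng, variance):
    """Zero-mean circularly-symmetric complex Gaussian sample."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


if __name__ == "__main__":
    rng = random.Random(1)
    n, num_messages, P, N0 = 200, 8, 10.0, 1.0   # illustrative values
    codebook = [[complex_gauss(rng, P) for _ in range(n)] for _ in range(num_messages)]
    h_hat = [complex_gauss(rng, 0.5) for _ in range(n)]      # estimate seen by the decoder
    h_tilde = [complex_gauss(rng, 0.5) for _ in range(n)]    # estimation error, unknown
    sent = 3
    y = [(h_hat[k] + h_tilde[k]) * codebook[sent][k] + complex_gauss(rng, N0)
         for k in range(n)]
    print("sent", sent, "decoded", nearest_neighbor_decode(y, h_hat, codebook))
```

The decoder uses only $\hat{h}[k]$, so the term $\tilde{H}X$ acts as additional noise, which is the mismatch that the GMI analysis accounts for.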

Since the above rate-splitting approach treats the single-user channel (1) as an $L$-user MAC with channel inputs
$X_1,\ldots,X_L$, i.e.,

$$Y = \sum_{\ell=1}^{L} (\hat{H} + \tilde{H}) X_\ell + Z \tag{51}$$

it may therefore seem plausible that said approach can sharpen Medard's lower bound. However, note that, in contrast
to [18] where the decoding rule is held fixed and the gain in achievable rate is due to the fact that treating the
single-user channel as a MAC allows us to consider codebooks under which the codewords are not necessarily drawn
independently, here the sharpening of the lower bound is due to a change of the decoding rule. For example, treating
(1) as a two-user MAC (51), the rate $R_2(P_1,P_2)$ corresponding to two-layer rate splitting [cf. (14)] can be achieved
by dividing the message $m$ into the submessages $(m_1, m_2)$, by drawing for each submessage a codebook according
to an i.i.d. Gaussian distribution, and by employing a decoder that chooses the pair $(m_1, m_2)$ that minimizes

$$D_1(m_1) = \frac{1}{n} \sum_{k=1}^{n} \left| y[k] - \hat{h}[k]\, x_1^{(m_1)}[k] \right|^2 \tag{52}$$

and

$$D_2(m_2) = \frac{1}{n} \sum_{k=1}^{n} \left| y[k] - \hat{h}[k]\, x_1^{(\hat{m}_1)}[k] - \hat{h}[k]\, x_2^{(m_2)}[k] \right|^2. \tag{53}$$

Here $x_\ell^{(m_\ell)}[1],\ldots,x_\ell^{(m_\ell)}[n]$, $\ell = 1,2$, denotes the codeword associated with message $m_\ell$, and $\hat{m}_1$ denotes the
submessage that minimizes $D_1(m_1)$. While treating the single-user channel (1) as a MAC (51) gives rise to a
codebook under which codewords

$$x_1^{(m_1)}[1] + x_2^{(m_2)}[1],\ \ldots,\ x_1^{(m_1)}[n] + x_2^{(m_2)}[n]$$

corresponding to different messages $m = (m_1, m_2)$ are not drawn independently, in contrast to [18], this is not
the reason why the achievable rate is increased. In fact, it can be shown that the same codebook together with the
scaled nearest neighbor decoding rule

$$D(m_1,m_2) = \frac{1}{n} \sum_{k=1}^{n} \left| y[k] - \hat{h}[k]\, x_1^{(m_1)}[k] - \hat{h}[k]\, x_2^{(m_2)}[k] \right|^2 \tag{54}$$

yields an achievable rate that is not larger than $R_M(P)$.

In a nutshell, Medard's lower bound $R_M(P)$ corresponds to the GMI for the scaled nearest neighbor decoding
rule (54), whereas the rate-splitting lower bounds $R[\mathbf{Q}]$ correspond to a recursive decoding rule as in (52) and
(53). The results from Section III, and specifically Lemma 2, demonstrate that this recursive decoding rule yields
a larger achievable rate than the scaled nearest neighbor decoding rule.

VI. Proof of Theorem|4] 
To prove Theorem |4] we shall first show that it suffices to consider uniform layerings 

V{P,K)^(^,2^,...,{K-l)^,p\. (55) 
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Specifically, we shall show that for every L-layering Q e Q{P, L) there exists some K such that U(P, K) 
outperforms Q, i.e., 

R[V{P,K)]> R[Q]. (56) 

This then implies that 

R*{P) = sup \ sup i?[Q] i = sup i?[U(P, K)] (57) 
ieN I QeQ(P,-L) I Km 



from which we obtain 



R*{P)^ lim i?[U(P,A')] (58) 



upon noting that U(P, if) C U(P, 2^) for every fc e N. 

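To show that for every $L$-layering $\mathbf Q\in\mathcal Q(P,L)$ there exists some $\mathbf U(P,K)$ outperforming $\mathbf Q$, we first note that one can find two $(L+1)$-layerings $\mathbf S\in\mathcal Q(P,L+1)$ and $\mathbf T\in\mathcal Q(P,L+1)$ satisfying $\mathbf Q\subset\mathbf S$ and $\mathbf T\subset\mathbf U(P,K)$ such that for every $\epsilon>0$ and sufficiently large $K$, we have

$$\max_{1\le\ell\le L+1}|T_\ell-S_\ell| < \epsilon. \tag{59}$$

Indeed, $\mathbf S$ may be obtained by including $(Q_1+Q_2)/2$ into $\mathbf Q$, i.e., $\mathbf S = \mathbf Q\cup\{(Q_1+Q_2)/2\}$. Furthermore, for $K$ larger than $P/(\min_{1\le\ell\le L}|S_{\ell+1}-S_\ell|)$, choosing

$$T_\ell = \Bigl\lceil \frac{S_\ell K}{P}\Bigr\rceil\frac{P}{K}$$

(where $\lceil x\rceil$ denotes the smallest integer larger than or equal to $x$) yields $\mathbf T\subset\mathbf U(P,K)$ and

$$\max_{1\le\ell\le L+1}|T_\ell-S_\ell| < \frac{P}{K} \tag{60}$$

from which (59) follows. To prove (56), we then need the following lemma.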
Lemma 7. The function $R[\mathbf Q]$ satisfies

$$\lim_{\mathbf Q\to\mathbf Q'} R[\mathbf Q] = R[\mathbf Q'] \tag{61}$$

where $\mathbf Q\to\mathbf Q'$ is to be understood as $\max_\ell|Q_\ell-Q'_\ell|\to 0$.

Proof: See Appendix C. ■

From Lemma 7 and from the observation (59), it follows that for every $\delta>0$ there exists a sufficiently large $K$ such that

$$|R[\mathbf T]-R[\mathbf S]| < \delta. \tag{62}$$

Since, by Lemma 2, we have

$$R[\mathbf Q] < R[\mathbf S] \quad\text{and}\quad R[\mathbf T] < R[\mathbf U(P,K)] \tag{63}$$

this yields

$$R[\mathbf Q] < R[\mathbf S] < R[\mathbf T]+\delta \tag{64}$$

which for a sufficiently small $\delta$ is strictly smaller than $R[\mathbf U(P,K)]$. This proves (56).
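The construction of the refined layering and its grid-aligned counterpart used in the argument above can be sketched as follows. This is only an illustration under stated assumptions: the function name `refine_and_quantize`, the example layering, and the values of `P` and `K` are hypothetical, and the rounding uses the ceiling (smallest integer not below its argument). Exact rational arithmetic avoids floating-point artifacts in the grid-membership check.

```python
from fractions import Fraction
from math import ceil

def refine_and_quantize(Q, P, K):
    """Return (S, T) for a layering Q with at least two levels.

    S adds the midpoint of the first two levels to Q; T rounds each level
    of S up to the uniform grid {P/K, 2P/K, ..., P}.
    """
    S = sorted(set(Q) | {(Q[0] + Q[1]) / 2})
    T = [Fraction(ceil(K * s / P)) * P / K for s in S]
    return S, T

# Hypothetical example: a 3-layering of total power P = 1.
P = Fraction(1)
Q = [Fraction(3, 10), Fraction(11, 20), P]
K = 50  # must exceed P divided by the smallest gap between levels of S
S, T = refine_and_quantize(Q, P, K)
U = {Fraction(k) * P / K for k in range(1, K + 1)}
max_gap = max(abs(t - s) for s, t in zip(S, T))
print(S, T, max_gap)
```

Because `K` exceeds `P` divided by the smallest gap in `S`, the rounded levels stay distinct, so `T` is again a layering with the same number of layers.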

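Recalling that (56) implies (58), we continue by evaluating $R[\mathbf U(P,K)]$ in the limit as $K$ tends to infinity. To this end, we write $R[\mathbf U(P,K)]$ as

$$R[\mathbf U(P,K)] = \sum_{\ell=1}^{K}\mathsf E\bigl[\log\bigl(1+\Gamma_{\ell,\mathbf U}(W_\ell,\hat H)\bigr)\bigr] \tag{65}$$

with [cf. (22)]

$$\Gamma_{\ell,\mathbf U}(W_\ell,\hat H) = \frac{|\hat H|^2\frac{P}{K}}{V(\hat H)(\ell-1)\frac{P}{K}W_\ell + V(\hat H)\frac{P}{K} + \bigl(|\hat H|^2+V(\hat H)\bigr)\bigl(P-\ell\frac{P}{K}\bigr) + N_0} = \frac{|\hat H|^2}{V(\hat H)(\ell-1)W_\ell + V(\hat H) + \bigl(|\hat H|^2+V(\hat H)\bigr)(K-\ell) + N_0\frac{K}{P}} \tag{66}$$

and

$$W_\ell \triangleq \begin{cases}\dfrac{\bigl|\sum_{i<\ell}X_i\bigr|^2}{(\ell-1)P/K}, & \ell = 2,\ldots,K\\[2mm] 0, & \ell = 1.\end{cases} \tag{67}$$

The random variables $(W_2,\ldots,W_K)$ are dependent but have equal marginals. (Each marginal has a unit-mean exponential distribution.) Since the RHS of (65) depends on $(W_2,\ldots,W_K)$ only via the marginal distributions, we can thus express $R[\mathbf U(P,K)]$ as

$$R[\mathbf U(P,K)] = \mathsf E\Bigl[\sum_{\ell=1}^{K}\log\bigl(1+\Gamma_{\ell,\mathbf U}(W,\hat H)\bigr)\Bigr] \tag{68}$$

where $W$ is independent of $\hat H$ and has a unit-mean exponential distribution.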
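To make the single-expectation form of the uniform-layering rate concrete, the following Monte Carlo sketch estimates it for a few values of $K$ and compares it with Médard's bound. It is an illustration only, not the authors' evaluation: it assumes a constant estimation-error variance `V`, an exponentially distributed $|\hat H|^2$ with mean `1 - V` (Rayleigh fading), and hypothetical values for `P` and `N0`. Rates are in nats.

```python
import math
import random

def layered_rate(h2, w, V, P, N0, K):
    """Sum over l = 1..K of log(1 + Gamma_l) for one realization (w, |h|^2),
    with Gamma_l = |h|^2 / (V(l-1)w + V + (|h|^2 + V)(K - l) + N0 K / P)."""
    total = 0.0
    for l in range(1, K + 1):
        denom = V * (l - 1) * w + V + (h2 + V) * (K - l) + N0 * K / P
        total += math.log1p(h2 / denom)
    return total

def estimate_rates(K, V=0.2, P=100.0, N0=1.0, n=20000, seed=0):
    """Return Monte Carlo estimates (Medard bound, uniform K-layer rate)."""
    rng = random.Random(seed)
    medard = layered = 0.0
    for _ in range(n):
        h2 = rng.expovariate(1.0 / (1.0 - V))  # assumed: |h_hat|^2 ~ Exp(mean 1 - V)
        w = rng.expovariate(1.0)               # unit-mean exponential W
        medard += math.log1p(h2 * P / (V * P + N0))
        layered += layered_rate(h2, w, V, P, N0, K)
    return medard / n, layered / n

if __name__ == "__main__":
    for K in (1, 2, 4, 16):
        print(K, estimate_rates(K))
```

With a single layer the sum reduces to Médard's bound, since the term involving $W$ carries the factor $(\ell-1)=0$; larger `K` should give larger estimates, in line with Lemma 2.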
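Combining (68) with (58) yields

$$R^*(P) = \lim_{K\to\infty}\mathsf E\Bigl[\sum_{\ell=1}^{K}\log\bigl(1+\Gamma_{\ell,\mathbf U}(W,\hat H)\bigr)\Bigr]. \tag{69}$$

We next show that

$$R^*(P) = \mathsf E\Bigl[\lim_{K\to\infty}\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(W,\hat H)\Bigr] \tag{70}$$

and evaluate $\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(W,\hat H)$ for every $(W,\hat H) = (w,h)$ in the limit as $K$ tends to infinity. To this end, we first lower-bound $R^*(P)$ using Fatou's lemma [19] and the lower bound $\log(1+x)\ge x-x^2/2$, $x\ge 0$:

$$\begin{aligned}
R^*(P) &= \lim_{K\to\infty}\mathsf E\Bigl[\sum_{\ell=1}^{K}\log\bigl(1+\Gamma_{\ell,\mathbf U}(W,\hat H)\bigr)\Bigr]\\
&\ge \mathsf E\Bigl[\varliminf_{K\to\infty}\sum_{\ell=1}^{K}\log\bigl(1+\Gamma_{\ell,\mathbf U}(W,\hat H)\bigr)\Bigr]\\
&\ge \mathsf E\Bigl[\varliminf_{K\to\infty}\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(W,\hat H)\Bigr] - \frac{1}{2}\,\mathsf E\Bigl[\varlimsup_{K\to\infty}\sum_{\ell=1}^{K}\Gamma^2_{\ell,\mathbf U}(W,\hat H)\Bigr]
\end{aligned} \tag{71}$$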
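where $\varliminf$ denotes the limit inferior and $\varlimsup$ the limit superior. It follows that the second term on the RHS of (71) is zero. Indeed, we have for every $(W,\hat H) = (w,h)$

$$\Gamma^2_{\ell,\mathbf U}(w,h) = \frac{|h|^4}{\bigl(V(h)(\ell-1)w + V(h) + (|h|^2+V(h))(K-\ell) + N_0\frac{K}{P}\bigr)^2} \le \frac{|h|^4}{\bigl(\min\{V(h)w,\,|h|^2+V(h)\}(K-1) + V(h) + N_0\frac{K}{P}\bigr)^2} \tag{72}$$

from which we obtain

$$\sum_{\ell=1}^{K}\Gamma^2_{\ell,\mathbf U}(w,h) \le \frac{K|h|^4}{\bigl(\min\{V(h)w,\,|h|^2+V(h)\}(K-1) + V(h) + N_0\frac{K}{P}\bigr)^2}. \tag{73}$$

Since $\sum_{\ell=1}^{K}\Gamma^2_{\ell,\mathbf U}(w,h)$ is nonnegative, and since the RHS of (73) vanishes as $K$ tends to infinity, it follows that, for every $(W,\hat H) = (w,h)$,

$$\lim_{K\to\infty}\sum_{\ell=1}^{K}\Gamma^2_{\ell,\mathbf U}(w,h) = 0. \tag{74}$$

Combining (74) with (71) yields

$$R^*(P) \ge \mathsf E\Bigl[\varliminf_{K\to\infty}\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(W,\hat H)\Bigr]. \tag{75}$$

We next show that

$$R^*(P) \le \mathsf E\Bigl[\varlimsup_{K\to\infty}\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(W,\hat H)\Bigr]. \tag{76}$$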



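To this end, we first use the upper bound $\log(1+x)\le x$, $x\ge 0$ to obtain

$$R^*(P) = \lim_{K\to\infty}\mathsf E\Bigl[\sum_{\ell=1}^{K}\log\bigl(1+\Gamma_{\ell,\mathbf U}(W,\hat H)\bigr)\Bigr] \le \varlimsup_{K\to\infty}\mathsf E\Bigl[\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(W,\hat H)\Bigr]. \tag{77}$$

Noting that, for every $(W,\hat H) = (w,h)$, the sum inside the expectation is upper-bounded by

$$\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(w,h) \le \sum_{\ell=1}^{K}\frac{|h|^2}{N_0\frac{K}{P}} = \frac{P|h|^2}{N_0} \triangleq \zeta(h) \tag{78}$$

and noting that, since $\hat H$ has a finite second moment, we have that $0\le\mathsf E[\zeta(\hat H)]<\infty$, we obtain (76) upon applying Fatou's lemma to the function $(w,h)\mapsto\zeta(h)-\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(w,h)$.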
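It remains to show that, for every $(W,\hat H) = (w,h)$,

$$\lim_{K\to\infty}\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(w,h) = \frac{|h|^2}{|h|^2+V(h)+\frac{N_0}{P}}\;\Theta\!\left(\frac{V(h)(w-1)-|h|^2}{|h|^2+V(h)+\frac{N_0}{P}}\right) \tag{79}$$

where $\Theta(\cdot)$ is defined in (31). This then implies that (75) and (76) coincide and

$$R^*(P) = \mathsf E\!\left[\frac{|\hat H|^2}{|\hat H|^2+V(\hat H)+\frac{N_0}{P}}\;\Theta\!\left(\frac{V(\hat H)(W-1)-|\hat H|^2}{|\hat H|^2+V(\hat H)+\frac{N_0}{P}}\right)\right] \tag{80}$$

which proves Theorem 4.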
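To show (79), we express the denominator in $\Gamma_{\ell,\mathbf U}(w,h)$ as $a\ell+bK+c$ with

$$a = V(h)(w-1)-|h|^2 \tag{81a}$$
$$b = |h|^2+V(h)+\frac{N_0}{P} \tag{81b}$$
$$c = V(h)(1-w) \tag{81c}$$

allowing us to write

$$\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(w,h) = \sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK+c}. \tag{82}$$

If $a = 0$, then this becomes

$$\lim_{K\to\infty}\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(w,h) = \lim_{K\to\infty}\frac{K|h|^2}{bK+c} = \frac{|h|^2}{b}. \tag{83}$$

We next consider the case $a\ne 0$. Note that

$$\lim_{K\to\infty}\left(\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK+c}-\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK}\right) = 0. \tag{84}$$

Indeed, by the triangle inequality, we have

$$\left|\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK+c}-\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK}\right| \le \sum_{\ell=1}^{K}\frac{|c|\,|h|^2}{(a\ell+bK+c)(a\ell+bK)}. \tag{85}$$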

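Since the two factors $(a\ell+bK+c)$ and $(a\ell+bK)$ appearing in the denominator are both positive affine functions of $\ell$ with equal coefficient $a$, their product takes its extremal values at $\ell = 1$ or $\ell = K$, depending on the sign of $a$. If $a>0$, then

$$\left|\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK+c}-\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK}\right| \le \frac{K|c|\,|h|^2}{(a+bK+c)(a+bK)}. \tag{86}$$

If $a<0$, then

$$\left|\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK+c}-\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK}\right| \le \frac{K|c|\,|h|^2}{\bigl((a+b)K+c\bigr)(a+b)K}. \tag{87}$$

Since the RHS of (86) and of (87) vanish as $K$ tends to infinity, this yields (84). Consequently,

$$\begin{aligned}
\lim_{K\to\infty}\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK+c} &= \lim_{K\to\infty}\sum_{\ell=1}^{K}\frac{|h|^2}{a\ell+bK}\\
&= \lim_{K\to\infty}\frac{1}{K}\sum_{\ell=1}^{K}\frac{|h|^2}{a\frac{\ell}{K}+b}\\
&= \int_0^1\frac{|h|^2}{ax+b}\,\mathrm dx\\
&= \frac{|h|^2}{a}\log\Bigl(1+\frac{a}{b}\Bigr)
\end{aligned} \tag{88}$$

where the third step follows by noting that the function $x\mapsto\frac{|h|^2}{ax+b}$ is Riemann integrable, so the Riemann sum converges to the integral.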
Using the definition of $\Theta(\cdot)$, it follows from (83) and (88) that

$$\lim_{K\to\infty}\sum_{\ell=1}^{K}\Gamma_{\ell,\mathbf U}(w,h) = \frac{|h|^2}{|h|^2+V(h)+\frac{N_0}{P}}\;\Theta\!\left(\frac{V(h)(w-1)-|h|^2}{|h|^2+V(h)+\frac{N_0}{P}}\right) \tag{89}$$

thus proving (79), which in turn proves Theorem 4.
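The convergence of the layer sum to its closed-form limit can be checked numerically for fixed $(w,h)$. The sketch below is a sanity check under assumptions, not part of the proof: it takes $\Theta(x) = \log(1+x)/x$ for $x\ne 0$ and $\Theta(0) = 1$, which is the form implied by the two cases $a = 0$ and $a\ne 0$ treated above, and it uses hypothetical values for `V`, `P` and `N0`.

```python
import math

def theta(x):
    """Assumed form of Theta: log(1 + x) / x for x != 0, and 1 at x = 0."""
    return 1.0 if x == 0.0 else math.log1p(x) / x

def coefficients(w, h2, V, P, N0):
    """Coefficients of the denominator a*l + b*K + c."""
    a = V * (w - 1.0) - h2
    b = h2 + V + N0 / P
    c = V * (1.0 - w)
    return a, b, c

def layer_sum(w, h2, V, P, N0, K):
    """Finite-K sum of |h|^2 / (a*l + b*K + c) over l = 1..K."""
    a, b, c = coefficients(w, h2, V, P, N0)
    return sum(h2 / (a * l + b * K + c) for l in range(1, K + 1))

def limit_value(w, h2, V, P, N0):
    """Closed-form limit (|h|^2 / b) * Theta(a / b)."""
    a, b, _ = coefficients(w, h2, V, P, N0)
    return h2 / b * theta(a / b)

if __name__ == "__main__":
    for w, h2 in [(0.3, 0.8), (2.5, 0.1), (3.0, 0.4)]:
        print(w, h2, layer_sum(w, h2, 0.2, 100.0, 1.0, 100000),
              limit_value(w, h2, 0.2, 100.0, 1.0))
```

The three test points cover $a<0$, $a>0$ and $a=0$; the gap between the finite sum and the limit shrinks roughly like $1/K$.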



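VII. Proof of Theorem 5

To prove Theorem 5, we show that, in the limit as the SNR tends to infinity, the difference

$$I(X_G;Y\mid\hat H_\rho)-R^*(\rho) \tag{90}$$

is upper-bounded by $\log(M)$ provided that (38b)-(38c) are satisfied. We express $R^*(\rho)$ as $\mathsf E[R^*(\rho,W,\hat H_\rho)]$ with

$$R^*(\rho,w,\xi) = \frac{|\xi|^2}{|\xi|^2+V_\rho(\xi)+\rho^{-1}}\;\Theta\!\left(\frac{V_\rho(\xi)(w-1)-|\xi|^2}{|\xi|^2+V_\rho(\xi)+\rho^{-1}}\right),\quad(\rho>0,\ w\ge 0,\ \xi\in\mathbb C) \tag{91}$$

and upper-bound $I(X_G;Y\mid\hat H_\rho)$ using (35):

$$\begin{aligned}
I(X_G;Y\mid\hat H_\rho)-R^*(\rho) &\le \mathsf E[R_M(\rho,\hat H_\rho)]+\mathsf E[\Delta(\rho,W,\hat H_\rho)]-\mathsf E[R^*(\rho,W,\hat H_\rho)]\\
&= \mathsf E[\Sigma(\rho,\hat H_\rho)]
\end{aligned} \tag{92}$$

where we have defined

$$R_M(\rho,\xi) = \log\Bigl(1+\frac{|\xi|^2}{V_\rho(\xi)+\rho^{-1}}\Bigr),\quad(\rho>0,\ \xi\in\mathbb C) \tag{93}$$

$$\Delta(\rho,w,\xi) = \log\Bigl(\frac{V_\rho(\xi)+\rho^{-1}}{\Phi_\rho(\xi)w+\rho^{-1}}\Bigr),\quad(\rho>0,\ w\ge 0,\ \xi\in\mathbb C) \tag{94}$$

and

$$\Sigma(\rho,\xi) = R_M(\rho,\xi)+\mathsf E[\Delta(\rho,W,\xi)]-\mathsf E[R^*(\rho,W,\xi)],\quad(\rho>0,\ \xi\in\mathbb C). \tag{95}$$

We next show that

$$\varlimsup_{\rho\to\infty}\mathsf E[\Sigma(\rho,\hat H_\rho)] \le \log(M). \tag{96}$$

To this end, we write the RHS of (92) as

$$\mathsf E[\Sigma(\rho,\hat H_\rho)] = \mathsf E\bigl[\Sigma(\rho,\hat H_\rho)\,I\{|\hat H_\rho|\le\xi_0\}\bigr]+\mathsf E\bigl[\Sigma(\rho,\hat H_\rho)\,I\{|\hat H_\rho|>\xi_0\}\bigr] \tag{97}$$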



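where $I\{\cdot\}$ denotes the indicator function (it is 1 if the statement in the curly brackets is true and is 0 otherwise). We then show that

$$\lim_{\xi_0\downarrow 0}\varlimsup_{\rho\to\infty}\mathsf E\bigl[\Sigma(\rho,\hat H_\rho)\,I\{|\hat H_\rho|\le\xi_0\}\bigr] = 0 \tag{98a}$$

and

$$\lim_{\xi_0\downarrow 0}\varlimsup_{\rho\to\infty}\mathsf E\bigl[\Sigma(\rho,\hat H_\rho)\,I\{|\hat H_\rho|>\xi_0\}\bigr] \le \log(M). \tag{98b}$$

To prove (98a), we need the following two lemmas.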
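Lemma 8. As $\rho$ tends to infinity, we have

$$\varlimsup_{\rho\to\infty}\sup_{\xi\in\mathbb C}\Sigma(\rho,\xi) \le \gamma+\log(M). \tag{99}$$

Proof: See Appendix D. ■

Lemma 9. Let $V_\rho(\hat H_\rho)$ and $\Phi_\rho(\xi)$ satisfy (38a) and (38c). Then

$$\lim_{\xi_0\downarrow 0}\varliminf_{\rho\to\infty}\Pr\{|\hat H_\rho|>\xi_0\} = 1. \tag{100}$$

Proof: See Appendix E. ■

Lemma 8 implies that, for every $\epsilon>0$ there exists a $\rho_0>0$ such that

$$\sup_{\xi\in\mathbb C}\Sigma(\rho,\xi) \le \gamma+\log(M+\epsilon),\quad\rho\ge\rho_0. \tag{101}$$

Consequently, for $\rho\ge\rho_0$ we have

$$\mathsf E\bigl[\Sigma(\rho,\hat H_\rho)\,I\{|\hat H_\rho|\le\xi_0\}\bigr] \le \bigl(\gamma+\log(M+\epsilon)\bigr)\Pr\{|\hat H_\rho|\le\xi_0\}. \tag{102}$$

Together with Lemma 9 this yields (98a) upon taking limits on both sides of (102):

$$\lim_{\xi_0\downarrow 0}\varlimsup_{\rho\to\infty}\Bigl\{\mathsf E\bigl[\Sigma(\rho,\hat H_\rho)\,I\{|\hat H_\rho|\le\xi_0\}\bigr]\Bigr\} \le \bigl(\gamma+\log(M+\epsilon)\bigr)\Bigl(\lim_{\xi_0\downarrow 0}\varlimsup_{\rho\to\infty}\Pr\{|\hat H_\rho|\le\xi_0\}\Bigr) = 0. \tag{103}$$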



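To prove (98b), we first upper-bound $\Sigma(\rho,\xi)$ by lower-bounding $\mathsf E[R^*(\rho,W,\xi)]$ for $\rho>0$ and $|\xi|>\xi_0$ using that $R^*(\rho,w,\xi)$ is nonnegative:

$$\mathsf E[R^*(\rho,W,\xi)] \ge \int_0^{\kappa(\rho,\xi)}R^*(\rho,w,\xi)\,e^{-w}\,\mathrm dw,\quad(\rho>0,\ |\xi|>\xi_0) \tag{104}$$

where

$$\kappa(\rho,\xi) \triangleq \frac{\xi_0}{\sqrt{V_\rho(\xi)+\rho^{-1}}}. \tag{105}$$

This choice for $\kappa(\rho,\xi)$ together with the assumption $V_\rho(\xi)\le 1$ ensures that $(w-1)V_\rho(\xi)-|\xi|^2$ is negative for all values of the integration variable $w$ and for all $|\xi|>\xi_0$. Using the definition (31) of the function $\Theta(\cdot)$ in (91), the lower bound (104) reads as

$$\mathsf E[R^*(\rho,W,\xi)] \ge \int_0^{\kappa(\rho,\xi)}\frac{|\xi|^2}{(w-1)V_\rho(\xi)-|\xi|^2}\log\Bigl(1+\frac{(w-1)V_\rho(\xi)-|\xi|^2}{|\xi|^2+V_\rho(\xi)+\rho^{-1}}\Bigr)e^{-w}\,\mathrm dw,\quad(\rho>0,\ |\xi|>\xi_0). \tag{106}$$

Combining (106) with (93)-(95) yields

$$\Sigma(\rho,\xi) \le \log\Bigl(1+\frac{|\xi|^2}{V_\rho(\xi)+\rho^{-1}}\Bigr)+\int_0^\infty\log\Bigl(\frac{V_\rho(\xi)+\rho^{-1}}{\Phi_\rho(\xi)w+\rho^{-1}}\Bigr)e^{-w}\,\mathrm dw-\int_0^{\kappa(\rho,\xi)}\frac{|\xi|^2}{(w-1)V_\rho(\xi)-|\xi|^2}\log\Bigl(1+\frac{(w-1)V_\rho(\xi)-|\xi|^2}{|\xi|^2+V_\rho(\xi)+\rho^{-1}}\Bigr)e^{-w}\,\mathrm dw,\quad(\rho>0,\ |\xi|>\xi_0). \tag{107}$$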
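Upon reordering terms, this inequality can be written as

$$\Sigma(\rho,\xi) \le J_1(\rho,\xi)+J_2(\rho,\xi)+J_3(\rho,\xi)+J_4(\rho,\xi) \tag{108}$$

with

$$J_1(\rho,\xi) = \int_0^{\kappa(\rho,\xi)}\log\Bigl(\frac{V_\rho(\xi)w+\rho^{-1}}{\Phi_\rho(\xi)w+\rho^{-1}}\Bigr)e^{-w}\,\mathrm dw \tag{109a}$$

$$J_2(\rho,\xi) = \int_{\kappa(\rho,\xi)}^{\infty}\log\Bigl(\frac{|\xi|^2+V_\rho(\xi)+\rho^{-1}}{\Phi_\rho(\xi)w+\rho^{-1}}\Bigr)e^{-w}\,\mathrm dw \tag{109b}$$

$$J_3(\rho,\xi) = \log\bigl(|\xi|^2+V_\rho(\xi)+\rho^{-1}\bigr)\int_0^{\kappa(\rho,\xi)}\frac{(1-w)V_\rho(\xi)}{(1-w)V_\rho(\xi)+|\xi|^2}\,e^{-w}\,\mathrm dw \tag{109c}$$

$$J_4(\rho,\xi) = -\int_0^{\kappa(\rho,\xi)}\frac{(1-w)V_\rho(\xi)}{(1-w)V_\rho(\xi)+|\xi|^2}\log\bigl(V_\rho(\xi)w+\rho^{-1}\bigr)e^{-w}\,\mathrm dw. \tag{109d}$$

We proceed by showing that, for every $\xi_0>0$,

$$\varlimsup_{\rho\to\infty}\mathsf E\bigl[J_1(\rho,\hat H_\rho)\,I\{|\hat H_\rho|>\xi_0\}\bigr] \le \log(M) \tag{110a}$$

$$\varlimsup_{\rho\to\infty}\mathsf E\bigl[J_i(\rho,\hat H_\rho)\,I\{|\hat H_\rho|>\xi_0\}\bigr] \le 0,\quad i = 2,3,4. \tag{110b}$$

The claim (98b) follows then by combining (110a) and (110b) with (108) and by letting $\xi_0$ tend to zero from above. To prove (110a) and (110b), the following lemma will be useful.

Lemma 10. Consider the family of random variables $T_\rho$ parametrized by $\rho>0$ taking values on $(0,1]$ and satisfying $\lim_{\rho\to\infty}\mathsf E[T_\rho] = 0$. Let $f(\cdot)$ be a continuous bounded function on the interval $(0,1]$ with limit $\lim_{t\downarrow 0}f(t) = f_0$. Then, we have

$$\lim_{\rho\to\infty}\mathsf E[f(T_\rho)] = f_0. \tag{111}$$

Proof: See Appendix F. ■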

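A. Limit related to $J_1(\rho,\xi)$

Noting that $\Phi_\rho(\xi)\le V_\rho(\xi)$, we have that

$$w\mapsto\frac{V_\rho(\xi)w+\rho^{-1}}{\Phi_\rho(\xi)w+\rho^{-1}}$$

is monotonically increasing in $w$. Consequently, $J_1(\rho,\xi)$ is upper-bounded by

$$J_1(\rho,\xi) \le \int_0^{\kappa(\rho,\xi)}\log\Bigl(\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr)e^{-w}\,\mathrm dw \le \Bigl(1-\exp\Bigl(-\frac{\xi_0}{\sqrt{V_\rho(\xi)+\rho^{-1}}}\Bigr)\Bigr)\log\Bigl(\sup_{\xi'\in\mathbb C}\frac{V_\rho(\xi')}{\Phi_\rho(\xi')}\Bigr),\quad(\rho>0,\ |\xi|>\xi_0). \tag{112}$$

Averaging (112) over $\hat H_\rho$, and upper-bounding

$$I\{|\hat H_\rho|>\xi_0\} \le 1 \tag{113}$$

yields

$$\mathsf E\bigl[J_1(\rho,\hat H_\rho)\,I\{|\hat H_\rho|>\xi_0\}\bigr] \le \mathsf E\Bigl[1-\exp\Bigl(-\frac{\xi_0}{\sqrt{V_\rho(\hat H_\rho)+\rho^{-1}}}\Bigr)\Bigr]\log\Bigl(\sup_{\xi\in\mathbb C}\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr),\quad\rho>0. \tag{114}$$

Noting that the function $t\mapsto\exp(-\xi_0/\sqrt t)$ is continuous and bounded on $(0,\infty)$ and vanishes as $t$ tends to zero, it follows from (38a) and Lemma 10 that

$$\lim_{\rho\to\infty}\mathsf E\Bigl[\exp\Bigl(-\frac{\xi_0}{\sqrt{V_\rho(\hat H_\rho)+\rho^{-1}}}\Bigr)\Bigr] = 0. \tag{115}$$

We further have, by (38c) and the continuity of $x\mapsto\log(x)$, that

$$\varlimsup_{\rho\to\infty}\log\Bigl(\sup_{\xi\in\mathbb C}\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr) \le \log(M). \tag{116}$$

Combining (115) and (116) with (114) proves (110a).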



B. Limit related to $J_2(\rho,\xi)$

To upper-bound $J_2(\rho,\xi)$, we use that, for $w\ge\kappa(\rho,\xi)$,



p 



<pp{Ow + p-^ ^p{i)w + p-^ -ppiOw + p-^ 



1 



< SUP^ ~ , 



(117) 



where the first inequality follows by lower-bounding $\rho^{-1}\ge 0$ and $w\ge\kappa(\rho,\xi)$ in the denominator of the first fraction and by lower-bounding $\Phi_\rho(\xi)w\ge 0$ in the denominator of the second fraction; and where the second inequality follows by lower-bounding $\kappa(\rho,\xi)\ge\xi_0/\sqrt{1+\rho^{-1}}$ using $V_\rho(\xi)\le 1$ and by maximizing over $\xi$. Combining (117) with (109b) yields



^2(p,e) <l0g 1 + SUp 



exp 



«6cl^p(0J Co' 

?o 



log 1 + sup<^ 



(VpiO\VT±Zl 



'Vp{^)+p-iJ V eecUp(OJ Co 
Averaging ( |1 18[ ) over /fp, using ( |1 13| l, and upper-bounding log(l + x) < a; gives 

E[j2{p,Hp)l{\Hp\>^o}] 
so 



(118) 



< E 



exp 



'Vp{Hp) + p-^ 

Combining \119) with \115\ and (|38c| proves \llQh) for i = 2 



sup< > -7^ , p > 0. 

CGcl^>p(C)J Co' 



(119) 
C. Limit related to $J_3(\rho,\xi)$

To prove (110b) for $i = 3$, we shall prove the stronger statement

$$\lim_{\rho\to\infty}\mathsf E\bigl[|J_3(\rho,\hat H_\rho)|\,I\{|\hat H_\rho|>\xi_0\}\bigr] = 0. \tag{120}$$

To this end, note that, by the triangle inequality,

$$\left|\frac{(1-w)V_\rho(\xi)}{(1-w)V_\rho(\xi)+|\xi|^2}\right| \le \frac{(1+w)V_\rho(\xi)}{\bigl(1-\kappa(\rho,\xi)\bigr)V_\rho(\xi)+\xi_0^2},\quad 0\le w\le\kappa(\rho,\xi). \tag{121}$$

In (121) we have also used that the denominator $(1-w)V_\rho(\xi)+|\xi|^2$ is positive for $0\le w\le\kappa(\rho,\xi)$ and that $|\xi|>\xi_0$, so the denominator is lower-bounded by $(1-\kappa(\rho,\xi))V_\rho(\xi)+\xi_0^2$.
It follows from (121) that the absolute value of the integral in (109c) is upper-bounded by

-<p,o)Vp{o+eo' 



Jo (1 
Consequently, we have 

^3 



«(p.?) (I _ (c) 

{l-n^)Vp{0 + \^\' 



V J (1 



(122) 



< 



{l + n{p,0){m)+p-') 



\og{\e+Vp{o+p-') , (p>o, 1^1 > Co) 



(123) 



where we define (a)+ = max(a,0)|^ Here the last step follows by upper-bounding Vp{^) < Vp{£,) + p ^ and by 
lower-bounding (l - k(p,0)^p(C) > -(k(p,0 - {VpiO + P~^) and e-'*(''^«) > 0. 

Using the definition ( fT05| ) of 0, and defining Tp{^) = yp(0 + the RHS of ( fHSj l reads as 



H{\^f + rp{0)\ 



(124) 



eo'-(eoVy^-i)"^p(e) 

Furthermore, using that Vp{^) < 1 and that x 1— > log(a;) is a monotonically increasing function satisfying log(l+a;) < 
X, gives 

iog(rp(0) < iog(|cP + TpiO) < p-' + \e- (125) 

The absolute value of the logarithm on the RHS of ( |124[ ) is thus upper-bounded by 

|log(|eP+Tp(e))| <max{|log(Tp(0)|,(p-^ + iei')} < |log(Tp(0) | + + ICp. (126) 
By noting that 



H{rpiO)\ 



The condition < 1 ensures tliat the denominator is still positive. 
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is a continuous and bounded function of < Tp{£^) < 1 + p ^ that vanishes as Tp(^) tends to zero, we obtain from 
(13 8 all and Lemma fTOl that 



lim E 

p— ^oo 



+ \\og{T,{H,))\l{\Hp\>i^] 



-1) Tp{Hp, 



< lim E 

p-4-OO 



|iog(r,(7?p))| 



= (127) 
where the inequality follows from ( |1 13| ). Furthermore, ( |113| l together with the Cauchy-Schwarz inequality yields 



1 



p 



H,ni{\Hp\>^ 



< E 



JL 



p-' + \Hp\' 



1 3-p(ffp) 







Y 


< 


E 








A 



?0 



^0 



1 3-p(^p) 



(128) 



Note that the term inside the first expected value is a continuous and bounded function of < Tp(^) < 1 + p ^ 



that vanishes as T'p(^) tends to zero, so it follows from ([38a]) and Lemma 10 that the first expected value on the 
RHS of ( |128| l vanishes as p tends to infinity. Further note that, for sufficiently large p, the second expectation on 
the RHS of (fT28]l is bounded since, by (|38bl). 



lim E 

p— f oo 















= lim E 




< oo 






p— >-oo 







(129) 



Consequently, the above arguments combine to demonstrate that 



lim E 



1 Tp{Hp) 



P 



Hp?)l{\Hp\>^o] 



Combining ([130} and ^\TT\ with ( [T26| ) and ( [T23| ) proves ( |120| l. 



(130) 



D. Limit related to $J_4(\rho,\xi)$

To prove ( |1 10b| i for i = 4, first note that < 1 implies that, for sufficiently large p, we have 

VpiOw + p-^ <1, 0<w<k{p,O- 



(131) 
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Further note that t/{t+ is monotonically increasing on (— |^p,oo). Consequently, for sufficiently large 
p, ( |109d[ ) is upper-bounded by 



v,{0 + \C\'Jo 



k(p,«) 



(l log :^+^ ' |logH|e-'"du; 



(132) 



where the second inequality follows by lower-bounding log{Vp{^)w + p^^) > log(Vp(^)) + log(u') and from the 
triangle inequality. 

By using that the exponential function is nonnegative, by upper-bounding the integral by integrating to infinity, 
and by using |^| > we can further upper-bound ( |132| i, for sufficiently large p, by 

1 



K + log^ 



(133) 



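where we define

$$\mathsf K \triangleq \int_0^\infty|\log(w)|\,e^{-w}\,\mathrm dw = \gamma-2\,\mathrm{Ei}(-1) \tag{134}$$

and where $\mathrm{Ei}(\cdot)$ denotes the exponential integral function, i.e.,

$$\mathrm{Ei}(-x) = -\int_x^\infty\frac{e^{-u}}{u}\,\mathrm du. \tag{135}$$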

Noting that the RHS of ( |133| l is a continuous and bounded function of < Vp(^) < 1 that vanishes as Vp{^) 
tends to zero, it follows from (|113|l, (|38a|, and Lemma fTOl that 



lim £[Ji{p,Hp)\{\Hp\ > Co}] < lim E[,hip,Hp)] < (136) 
thus proving ( |1 10b| i for i ~ 4. 

VIII. Summary and Conclusion 

We have demonstrated that rate splitting can increase the well-known capacity lower bound (5) by Médard [1] for fading channels with imperfect channel-state information at the receiver. By computing the supremum of these bounds over all possible rate-splitting strategies, we have established a novel capacity lower bound which is larger than Médard's lower bound (5).

Viewing said capacity lower bound as a lower bound on the Gaussian-input mutual information, we have studied the high-SNR behavior of the novel bound under the assumption that the variance of the channel estimation error tends to zero with the SNR. We have shown that, for Gaussian fading, the rate-splitting bound is asymptotically tight in the sense that its difference to the Gaussian-input mutual information vanishes as the SNR tends to infinity. In contrast to Médard's lower bound, which is only asymptotically tight if the variance of the estimation error decays faster than the reciprocal of the SNR, the novel lower bound is asymptotically tight irrespective of the speed at which this variance decays.



January 28, 2013 DRAFT



In [15], Lapidoth and Shamai have shown that Médard's lower bound corresponds to the GMI for i.i.d. Gaussian codebooks and a scaled nearest neighbor decoder. From this mismatched decoding perspective, the combination of rate splitting (at the transmitter) and successive decoding (at the receiver) corresponds to a modification of the encoding/decoding rule which turns out to be advantageous in achieving higher transmission rates.

Appendix A
Proof of Proposition 1

We expand the mutual information as

$$I(S;AS+B\mid A,C) = h(S\mid A,C)-h(S\mid AS+B,A,C). \tag{137}$$

Since, by assumption, $S$ is zero-mean, variance-$P$, circularly-symmetric Gaussian and independent of $(A,C)$, the first entropy on the RHS of (137) is readily evaluated as

$$h(S\mid A,C) = h(S) = \log(\pi eP). \tag{138}$$

Conditioned on $(A,C) = (a,c)$, the second entropy can be upper-bounded as follows:

$$\begin{aligned}
h(S\mid AS+B,A=a,C=c) &= h\bigl(S-\alpha_{a,c}(AS+B-\mu_{B|a,c})\,\big|\,AS+B,A=a,C=c\bigr)\\
&\le h\bigl(S-\alpha_{a,c}(AS+B-\mu_{B|a,c})\,\big|\,A=a,C=c\bigr)\\
&\le \log\Bigl(\pi e\,\mathsf E\bigl[|S-\alpha_{a,c}(AS+B-\mu_{B|a,c})|^2\,\big|\,A=a,C=c\bigr]\Bigr)
\end{aligned} \tag{139}$$

for any arbitrary $\alpha_{a,c}\in\mathbb C$, where $\mu_{B|a,c}\triangleq\mathsf E[B\mid A=a,C=c]$. Here the first inequality follows because conditioning cannot increase entropy, and the second inequality follows from the entropy-maximizing property of the Gaussian distribution. Combining (139) with (138) and (137) thus yields for every $(A,C) = (a,c)$ and $\alpha_{a,c}$

$$I(S;AS+B\mid A=a,C=c) \ge \log\left(\frac{P}{\mathsf E\bigl[|S-\alpha_{a,c}(AS+B-\mu_{B|a,c})|^2\,\big|\,A=a,C=c\bigr]}\right). \tag{140}$$

We choose $\alpha_{a,c}$ so that $\alpha_{a,c}(aS+B-\mu_{B|a,c})$ is the linear MMSE estimate of $S$, namely,

$$\alpha_{a,c} = \frac{\mathsf E\bigl[S(AS+B-\mu_{B|a,c})^*\,\big|\,A=a,C=c\bigr]}{\mathsf E\bigl[|AS+B-\mu_{B|a,c}|^2\,\big|\,A=a,C=c\bigr]} = \frac{a^*P}{|a|^2P+V_B(a,c)} \tag{141}$$

where $V_B(a,c)$ denotes the conditional variance of $B$ conditioned on $(A,C) = (a,c)$. This yields

$$\mathsf E\bigl[|S-\alpha_{a,c}(AS+B-\mu_{B|a,c})|^2\,\big|\,A=a,C=c\bigr] = \frac{P\,V_B(a,c)}{|a|^2P+V_B(a,c)}. \tag{142}$$

Consequently, combining (142) with (140) gives

$$I(S;AS+B\mid A=a,C=c) \ge \log\Bigl(1+\frac{|a|^2P}{V_B(a,c)}\Bigr). \tag{143}$$

Proposition 1 follows then by averaging over $(A,C)$.
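The linear MMSE step in the proof can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes, as the closed form of the LMMSE coefficient implies, that the centered $B$ is uncorrelated with $S$ given $(A,C) = (a,c)$, so that the estimation error has second moment $|1-\alpha a|^2P+|\alpha|^2V_B$. The function names and the numerical values are hypothetical.

```python
import math

def mse(alpha, a, P, VB):
    """E|S - alpha (aS + B - mu_B)|^2, assuming S has variance P and the
    centered B has variance VB and is uncorrelated with S."""
    return abs(1 - alpha * a) ** 2 * P + abs(alpha) ** 2 * VB

def lmmse_coeff(a, P, VB):
    """Coefficient a* P / (|a|^2 P + VB) of the linear MMSE estimate of S."""
    return a.conjugate() * P / (abs(a) ** 2 * P + VB)

# Hypothetical values for the channel gain a, the power P, and the variance of B.
a, P, VB = 0.6 - 0.8j, 4.0, 0.5
alpha = lmmse_coeff(a, P, VB)
best = mse(alpha, a, P, VB)
print(best, P * VB / (abs(a) ** 2 * P + VB))
print(math.log(P / best), math.log1p(abs(a) ** 2 * P / VB))
```

Any other choice of the coefficient gives a larger error and hence a weaker, though still valid, lower bound on the mutual information.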
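Appendix B
Proof of Lemma 2

In order to prove Lemma 2, we shall demonstrate for every $L\in\mathbb N$ that, if the layerings $\mathbf Q\in\mathcal Q(P,L)$ and $\mathbf Q'\in\mathcal Q(P,L+1)$ satisfy

$$\{Q_1,\ldots,Q_L\}\subset\{Q'_1,\ldots,Q'_{L+1}\} \tag{144}$$

then $R[\mathbf Q]\le R[\mathbf Q']$ with equality if, and only if, $\Pr\{\hat H\cdot V(\hat H) = 0\} = 1$. The general case where $\mathbf Q'\in\mathcal Q(P,L')$ for some arbitrary $L'>L$ follows then directly from the case $L' = L+1$ by applying the above result $(L'-L)$ times.

Let the element in $\mathbf Q'$ that is not contained in $\mathbf Q$ be at position $\tau = 1,\ldots,L$, i.e.,¹

$$Q_\ell = Q'_\ell,\quad\ell = 1,\ldots,\tau-1 \tag{145a}$$

and

$$Q_\ell = Q'_{\ell+1},\quad\ell = \tau,\ldots,L. \tag{145b}$$

¹By the definition of a layering, we have $Q'_{L+1} = Q_L = P$, so the element in $\mathbf Q'$ not contained in $\mathbf Q$ cannot be at position $\tau = L+1$.

We next express $\Gamma_{\ell,\mathbf A}(X^{\ell-1},\hat H)$ in (22) for some general layering $\mathbf A$ as

$$\Gamma_{\ell,\mathbf A}(X^{\ell-1},\hat H) = \frac{|\hat H|^2(A_\ell-A_{\ell-1})}{V(\hat H)\bigl|\sum_{i<\ell}X_i\bigr|^2+V(\hat H)(A_\ell-A_{\ell-1})+\bigl(|\hat H|^2+V(\hat H)\bigr)(P-A_\ell)+N_0}. \tag{146}$$

Noting that for the layering $\mathbf Q$ the term $|\sum_{i<\ell}X_i|^2$ has an exponential distribution with mean $Q_{\ell-1}$, whereas for the layering $\mathbf Q'$, it has an exponential distribution with mean $Q'_{\ell-1}$, and using (145a) and (145b), it can be easily verified that

$$\mathsf E\bigl[\log\bigl(1+\Gamma_{\ell,\mathbf Q}(X^{\ell-1},\hat H)\bigr)\bigr] = \mathsf E\bigl[\log\bigl(1+\Gamma_{\ell,\mathbf Q'}(X^{\ell-1},\hat H)\bigr)\bigr],\quad\ell = 1,\ldots,\tau-1 \tag{147}$$

and

$$\mathsf E\bigl[\log\bigl(1+\Gamma_{\ell,\mathbf Q}(X^{\ell-1},\hat H)\bigr)\bigr] = \mathsf E\bigl[\log\bigl(1+\Gamma_{\ell+1,\mathbf Q'}(X^{\ell},\hat H)\bigr)\bigr],\quad\ell = \tau+1,\ldots,L. \tag{148}$$

Subtracting $R[\mathbf Q]$ from $R[\mathbf Q']$ yields thus

$$R[\mathbf Q']-R[\mathbf Q] = \mathsf E\bigl[\log\bigl(1+\Gamma_{\tau,\mathbf Q'}(X^{\tau-1},\hat H)\bigr)\bigr]+\mathsf E\bigl[\log\bigl(1+\Gamma_{\tau+1,\mathbf Q'}(X^{\tau},\hat H)\bigr)\bigr]-\mathsf E\bigl[\log\bigl(1+\Gamma_{\tau,\mathbf Q}(X^{\tau-1},\hat H)\bigr)\bigr]. \tag{149}$$

Since the random variables $X_1,\ldots,X_\tau,\hat H$ are independent, we can express the second expectation as

$$\mathsf E_{X^{\tau-1},\hat H}\Bigl[\mathsf E_{X_\tau}\bigl[\log\bigl(1+\Gamma_{\tau+1,\mathbf Q'}(X^{\tau},\hat H)\bigr)\bigr]\Bigr] \tag{150}$$

where the subscript indicates over which random variables the expected value is computed. Using that, for every $a>0$, the function $x\mapsto\log(1+a/x)$ is strictly convex in $x>0$, it follows from Jensen's inequality that, for every $X^{\tau-1} = x^{\tau-1}$ and $\hat H = h$, the inner expectation is lower-bounded by

$$\mathsf E_{X_\tau}\bigl[\log\bigl(1+\Gamma_{\tau+1,\mathbf Q'}(x^{\tau-1},X_\tau,h)\bigr)\bigr] \ge \log\bigl(1+\bar\Gamma_{\tau+1,\mathbf Q'}(x^{\tau-1},h)\bigr) \tag{151}$$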
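where we define

$$\bar\Gamma_{\tau+1,\mathbf Q'}(x^{\tau-1},h) \triangleq \frac{|h|^2(Q'_{\tau+1}-Q'_\tau)}{V(h)\bigl|\sum_{i<\tau}x_i\bigr|^2+\bigl(|h|^2+V(h)\bigr)P-|h|^2Q'_{\tau+1}-V(h)Q'_{\tau-1}+N_0} \tag{152}$$

and where the denominator of $\bar\Gamma_{\tau+1,\mathbf Q'}(x^{\tau-1},h)$ is obtained by noting that $X_\tau$ has zero mean, so

$$\mathsf E\Bigl[\Bigl|\sum_{i<\tau}x_i+X_\tau\Bigr|^2\Bigr] = \Bigl|\sum_{i<\tau}x_i\Bigr|^2+\mathsf E\bigl[|X_\tau|^2\bigr]. \tag{153}$$

Since $\mathbf Q'\in\mathcal Q(P,L+1)$ implies that $\mathsf E[|X_\tau|^2]>0$, the inequality in (151) is strict except in the trivial cases $V(h) = 0$ or $h = 0$. Combining (150) and (151) yields

$$\mathsf E\bigl[\log\bigl(1+\Gamma_{\tau+1,\mathbf Q'}(X^{\tau},\hat H)\bigr)\bigr] \ge \mathsf E\bigl[\log\bigl(1+\bar\Gamma_{\tau+1,\mathbf Q'}(X^{\tau-1},\hat H)\bigr)\bigr] \tag{154}$$

which together with (149) gives

$$R[\mathbf Q']-R[\mathbf Q] \ge \mathsf E\bigl[\log\bigl(1+\Gamma_{\tau,\mathbf Q'}(X^{\tau-1},\hat H)\bigr)\bigr]+\mathsf E\left[\log\left(\frac{1+\bar\Gamma_{\tau+1,\mathbf Q'}(X^{\tau-1},\hat H)}{1+\Gamma_{\tau,\mathbf Q}(X^{\tau-1},\hat H)}\right)\right] \tag{155}$$

with the inequality being strict except if $\Pr\{\hat H\cdot V(\hat H) = 0\} = 1$.

We next use (145a) and (145b) and the fact that $|\sum_{i<\tau}X_i|^2$ has an exponential distribution with mean $Q'_{\tau-1}$ under both layerings $\mathbf Q$ and $\mathbf Q'$ to evaluate the second expected value on the RHS of (155):

$$\begin{aligned}
\mathsf E\left[\log\left(\frac{1+\bar\Gamma_{\tau+1,\mathbf Q'}(X^{\tau-1},\hat H)}{1+\Gamma_{\tau,\mathbf Q}(X^{\tau-1},\hat H)}\right)\right] &= \mathsf E\left[\log\left(\frac{T-|\hat H|^2Q'_\tau-V(\hat H)Q'_{\tau-1}+N_0}{T-|\hat H|^2Q'_{\tau-1}-V(\hat H)Q'_{\tau-1}+N_0}\right)\right]\\
&= -\mathsf E\left[\log\left(1+\frac{|\hat H|^2(Q'_\tau-Q'_{\tau-1})}{T-|\hat H|^2Q'_\tau-V(\hat H)Q'_{\tau-1}+N_0}\right)\right]
\end{aligned} \tag{156}$$

where we introduce

$$T \triangleq V(\hat H)\Bigl|\sum_{i<\tau}X_i\Bigr|^2+\bigl(|\hat H|^2+V(\hat H)\bigr)P \tag{157}$$

for ease of exposition. Noting that

$$\frac{|\hat H|^2(Q'_\tau-Q'_{\tau-1})}{T-|\hat H|^2Q'_\tau-V(\hat H)Q'_{\tau-1}+N_0} = \Gamma_{\tau,\mathbf Q'}(X^{\tau-1},\hat H) \tag{158}$$

yields that the RHS of (155) is zero, thus demonstrating that

$$R[\mathbf Q] \le R[\mathbf Q'] \tag{159}$$

with equality if, and only if, $\Pr\{\hat H\cdot V(\hat H) = 0\} = 1$. This proves Lemma 2.

Appendix C
Proof of Lemma 7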



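We show that

$$\lim_{\mathbf Q\to\mathbf Q'}R[\mathbf Q] = R[\mathbf Q'] \tag{160}$$

where $\mathbf Q\to\mathbf Q'$ should be read as

$$\max_\ell|Q_\ell-Q'_\ell|\to 0. \tag{161}$$

To this end, we write $R[\mathbf Q]$ as

$$R[\mathbf Q] = \sum_{\ell=1}^{L}\mathsf E\bigl[\log\bigl(1+\Gamma_{\ell,\mathbf Q}(W_\ell,\hat H)\bigr)\bigr] \tag{162}$$

with

$$\Gamma_{\ell,\mathbf Q}(W_\ell,\hat H) = \frac{|\hat H|^2(Q_\ell-Q_{\ell-1})}{V(\hat H)W_\ell Q_{\ell-1}+V(\hat H)(Q_\ell-Q_{\ell-1})+\bigl(|\hat H|^2+V(\hat H)\bigr)(P-Q_\ell)+N_0} \tag{163}$$

(assuming that $Q_0 = 0$) and

$$W_\ell \triangleq \begin{cases}0, & \ell = 1\\[1mm] \dfrac{\bigl|\sum_{i<\ell}X_i\bigr|^2}{Q_{\ell-1}}, & \ell = 2,\ldots,L.\end{cases} \tag{164}$$

Using that, with probability one,

$$0 \le \log\bigl(1+\Gamma_{\ell,\mathbf Q}(W_\ell,\hat H)\bigr) \le \frac{|\hat H|^2P}{N_0} \tag{165}$$

and that $\hat H$ has finite variance, it follows from the Dominated Convergence Theorem [19] that

$$\lim_{\mathbf Q\to\mathbf Q'}\mathsf E\bigl[\log\bigl(1+\Gamma_{\ell,\mathbf Q}(W_\ell,\hat H)\bigr)\bigr] = \mathsf E\Bigl[\lim_{\mathbf Q\to\mathbf Q'}\log\bigl(1+\Gamma_{\ell,\mathbf Q}(W_\ell,\hat H)\bigr)\Bigr] = \mathsf E\bigl[\log\bigl(1+\Gamma_{\ell,\mathbf Q'}(W_\ell,\hat H)\bigr)\bigr] \tag{166}$$

where the last step follows by noting that $\mathbf Q\mapsto\log(1+\Gamma_{\ell,\mathbf Q}(W_\ell,\hat H))$ is a continuous function of $\mathbf Q$. Combining (166) with (162) proves (160) and, hence, Lemma 7.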



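Appendix D
Proof of Lemma 8

We first note that specializing Theorem 1 to the case $\hat H = \xi$ gives

$$\mathsf E[R^*(\rho,W,\xi)] \ge R_M(\rho,\xi). \tag{167}$$

Consequently, we obtain from the definitions (95) and (94) of $\Sigma(\rho,\xi)$ and $\Delta(\rho,w,\xi)$ that

$$\Sigma(\rho,\xi) \le \mathsf E\Bigl[\log\Bigl(\frac{V_\rho(\xi)+\rho^{-1}}{\Phi_\rho(\xi)W+\rho^{-1}}\Bigr)\Bigr] = \log\bigl(1+\rho V_\rho(\xi)\bigr)-\mathsf E\bigl[\log\bigl(1+\rho\Phi_\rho(\xi)W\bigr)\bigr]. \tag{168}$$

The second term on the RHS of (168) gives [20]

$$\mathsf E\bigl[\log\bigl(1+\rho\Phi_\rho(\xi)W\bigr)\bigr] = -e^{\frac{1}{\rho\Phi_\rho(\xi)}}\,\mathrm{Ei}\Bigl(-\frac{1}{\rho\Phi_\rho(\xi)}\Bigr). \tag{169}$$

Consequently, we have

$$\begin{aligned}
\mathsf E[\Delta(\rho,W,\xi)] &= \log\bigl(1+\rho V_\rho(\xi)\bigr)+e^{\frac{1}{\rho\Phi_\rho(\xi)}}\,\mathrm{Ei}\Bigl(-\frac{1}{\rho\Phi_\rho(\xi)}\Bigr)\\
&= \log\Bigl(1+\rho\Phi_\rho(\xi)\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr)+e^{\frac{1}{\rho\Phi_\rho(\xi)}}\,\mathrm{Ei}\Bigl(-\frac{1}{\rho\Phi_\rho(\xi)}\Bigr)\\
&\le \log\Bigl(1+\rho\Phi_\rho(\xi)\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr)+\mathrm{Ei}\Bigl(-\frac{1}{\rho\Phi_\rho(\xi)}\Bigr)\\
&\triangleq g\Bigl(\rho\Phi_\rho(\xi);\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr)
\end{aligned} \tag{170}$$

where the inequality follows because $\mathrm{Ei}(-x)$ is negative for $x>0$ and $e^x\ge 1$, $x\ge 0$; and where the last step should be viewed as the definition of $g(\cdot;\cdot)$.

For a fixed $\alpha$, the function $t\mapsto g(t;\alpha)$ satisfies [14, Section VI-B]

$$\lim_{t\to\infty}g(t;\alpha) = \gamma+\log(\alpha). \tag{171}$$

We next show that, for every $\alpha\ge 1$, the function $t\mapsto g(t;\alpha)$ is monotonically increasing. Indeed, we have

$$\begin{aligned}
\frac{\partial}{\partial t}g(t;\alpha) &= \frac{e^{-1/t}}{(1+\alpha t)t}\bigl[e^{1/t}\alpha t-1-\alpha t\bigr]\\
&\ge \frac{e^{-1/t}}{(1+\alpha t)t}\,[\alpha-1]\\
&\ge 0,\quad\alpha\ge 1
\end{aligned} \tag{172}$$

where the second step follows from the lower bound $e^{1/t}\ge 1+\frac{1}{t}$, $t>0$.

Due to the entropy-maximizing property of Gaussian random variables, we have $V_\rho(\xi)\ge\Phi_\rho(\xi)$, which implies that $V_\rho(\xi)/\Phi_\rho(\xi)\ge 1$. It thus follows from (170)-(172) that

$$\Sigma(\rho,\xi) \le g\Bigl(\rho\Phi_\rho(\xi);\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr) \le \lim_{t\to\infty}g\Bigl(t;\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr) = \gamma+\log\Bigl(\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr),\quad(\rho>0,\ \xi\in\mathbb C). \tag{173}$$

Maximizing the RHS of (173) over $\xi\in\mathbb C$ and computing the limit as $\rho$ tends to infinity gives

$$\varlimsup_{\rho\to\infty}\sup_{\xi\in\mathbb C}\Sigma(\rho,\xi) \le \varlimsup_{\rho\to\infty}\log\Bigl(\sup_{\xi\in\mathbb C}\frac{V_\rho(\xi)}{\Phi_\rho(\xi)}\Bigr)+\gamma \le \gamma+\log(M) \tag{174}$$

where the last step follows from the continuity of $x\mapsto\log(x)$ and from (38c). This proves Lemma 8.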
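The two properties of the function $g(t;\alpha)$ used in this proof, monotonicity in $t$ for $\alpha\ge 1$ and the limit $\gamma+\log(\alpha)$, can be checked numerically. The sketch below assumes the form $g(t;\alpha) = \log(1+\alpha t)+\mathrm{Ei}(-1/t)$ and evaluates $\mathrm{Ei}(-x)$ through its standard power series, which is adequate for the moderate arguments used here; the helper names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def Ei_neg(x, terms=60):
    """Ei(-x) for x > 0 via gamma + ln(x) + sum_{k>=1} (-x)^k / (k * k!).
    Intended for small to moderate x (here x <= 2)."""
    s, term = 0.0, 1.0
    for k in range(1, terms + 1):
        term *= -x / k      # term now equals (-x)^k / k!
        s += term / k
    return EULER_GAMMA + math.log(x) + s

def g(t, alpha):
    """Assumed form g(t; alpha) = log(1 + alpha t) + Ei(-1/t)."""
    return math.log1p(alpha * t) + Ei_neg(1.0 / t)

if __name__ == "__main__":
    for alpha in (2.0, 5.0):
        grid = [0.5 * 1.5 ** i for i in range(20)]
        print(alpha, [round(g(t, alpha), 4) for t in grid[::5]],
              EULER_GAMMA + math.log(alpha))
```

For $\alpha$ close to 1 the increments of $g$ become very small at large $t$, so a floating-point monotonicity check is only meaningful on a moderate range of $t$.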
Appendix E
Proof of Lemma 9

For any $\epsilon>0$ we have

$$\Pr\{|H|>\xi_0+\epsilon\} = \Pr\{|H|>\xi_0+\epsilon,\ |\hat H_\rho|\le\xi_0\}+\Pr\{|H|>\xi_0+\epsilon,\ |\hat H_\rho|>\xi_0\} \tag{175}$$

$$\le \Pr\{|H-\hat H_\rho|>\epsilon,\ |\hat H_\rho|\le\xi_0\}+\Pr\{|\hat H_\rho|>\xi_0\} \tag{176}$$

where the inequality follows by upper-bounding the first probability on the RHS of (175) using the triangle inequality and by upper-bounding the second probability on the RHS of (175) using the law of total probability. Using the law of total probability and Markov's inequality [21], the first term on the RHS of (176) can be further upper-bounded by

$$\Pr\{|H-\hat H_\rho|>\epsilon,\ |\hat H_\rho|\le\xi_0\} \le \Pr\{|H-\hat H_\rho|>\epsilon\} \le \frac{\mathsf E[V_\rho(\hat H_\rho)]}{\epsilon^2}. \tag{177}$$

Combining (177) with (176) gives

$$\Pr\{|H|>\xi_0+\epsilon\} \le \frac{\mathsf E[V_\rho(\hat H_\rho)]}{\epsilon^2}+\Pr\{|\hat H_\rho|>\xi_0\}. \tag{178}$$

Since, by the lemma's assumptions, we have $\lim_{\rho\to\infty}\mathsf E[V_\rho(\hat H_\rho)] = 0$, taking the limit inferior for $\rho\to\infty$ on either side of (178) yields

$$\Pr\{|H|>\xi_0+\epsilon\} \le \varliminf_{\rho\to\infty}\Pr\{|\hat H_\rho|>\xi_0\}. \tag{179}$$

We next note that, for sufficiently large $\rho$, the conditional distribution of $\tilde H_\rho$, conditioned on $\hat H_\rho = \xi$, must be absolutely continuous with respect to the Lebesgue measure, since otherwise $\Phi_\rho(\xi) = 0$, contradicting the lemma's assumption that $\Phi_\rho(\xi)$ satisfies (38c). This implies that the cumulative distribution function of $|H| = |\hat H_\rho+\tilde H_\rho|$ is continuous. Consequently, we have

$$\lim_{\epsilon\downarrow 0}\Pr\{|H|>\xi_0+\epsilon\} = \Pr\{|H|>\xi_0\} \tag{180}$$

which together with (179) gives

$$\varliminf_{\rho\to\infty}\Pr\{|\hat H_\rho|>\xi_0\} \ge \Pr\{|H|>\xi_0\} \tag{181}$$

upon letting $\epsilon$ tend to zero from above. Using the continuity of the cumulative distribution function of $|H|$, Lemma 9 follows from (181) by noting that

$$\lim_{\xi_0\downarrow 0}\Pr\{|H|>\xi_0\} = \Pr\{|H|>0\} = 1. \tag{182}$$
Appendix F
Proof of Lemma 10

For every family of random variables $T_\rho$ parametrized by $\rho>0$ and taking values on $(0,1]$, we have by Markov's inequality

$$\Pr\{T_\rho\ge\nu\} \le \frac{\mathsf E[T_\rho]}{\nu},\quad\text{for every }\nu>0. \tag{183}$$

Using that $\lim_{\rho\to\infty}\mathsf E[T_\rho] = 0$, we thus have

$$\lim_{\rho\to\infty}\Pr\{T_\rho\ge\nu\} = 0,\quad\text{for every }\nu>0 \tag{184}$$

or equivalently, $\lim_{\rho\to\infty}\Pr\{T_\rho<\nu\} = 1$. We upper-bound $\mathsf E[f(T_\rho)]$ for any $\nu>0$ as

$$\begin{aligned}
\mathsf E[f(T_\rho)] &= \mathsf E\bigl[f(T_\rho)\,I\{T_\rho<\nu\}\bigr]+\mathsf E\bigl[f(T_\rho)\,I\{T_\rho\ge\nu\}\bigr]\\
&\le \sup_{0<t<\nu}f(t)\,\Pr\{T_\rho<\nu\}+\sup_{\nu\le t\le 1}f(t)\,\Pr\{T_\rho\ge\nu\}.
\end{aligned} \tag{185a}$$

Similarly, we lower-bound $\mathsf E[f(T_\rho)]$ for any $\nu>0$ as

$$\mathsf E[f(T_\rho)] \ge \inf_{0<t<\nu}f(t)\,\Pr\{T_\rho<\nu\}+\inf_{\nu\le t\le 1}f(t)\,\Pr\{T_\rho\ge\nu\}. \tag{185b}$$

Since $f(\cdot)$ is bounded, and due to (184), taking limits for $\rho\to\infty$ in (185a) and (185b) gives

$$\inf_{0<t<\nu}f(t) \le \varliminf_{\rho\to\infty}\mathsf E[f(T_\rho)] \le \varlimsup_{\rho\to\infty}\mathsf E[f(T_\rho)] \le \sup_{0<t<\nu}f(t). \tag{186}$$

Taking the limit as $\nu$ tends to zero from above, we finally obtain

$$\lim_{\rho\to\infty}\mathsf E[f(T_\rho)] = \lim_{t\downarrow 0}f(t) = f_0 \tag{187}$$

which proves Lemma 10.